Data Augmentation with In-Context Learning and Comparative Evaluation in Math Word Problem Solving (2404.03938v1)
Abstract: Math Word Problem (MWP) solving is a challenging task in NLP. This study aims to provide MWP solvers with a more diverse training set, ultimately improving their ability to solve a wider range of math problems. We propose several data augmentation methods that modify the problem texts and equations, namely synonym replacement, rule-based question replacement, and rule-based question reversal, applied over two English MWP datasets. We further introduce a new in-context learning augmentation method that employs the Llama-7b language model with instruction-based prompting to rephrase the problem texts. Performance evaluations conducted on 9 baseline models reveal that the augmentation methods outperform the baselines; moreover, concatenating examples generated by different augmentation methods improves performance further.
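To make the two augmentation flavors in the abstract concrete, the sketch below shows one plausible realization of (a) text-side synonym replacement and (b) instruction-based rephrasing with a Llama-7b checkpoint. This is an illustrative sketch, not the authors' released code: the WordNet synonym source, the checkpoint id `huggyllama/llama-7b`, the prompt wording, and the helper names `synonym_replace` and `llm_rephrase` are all assumptions for illustration.

```python
# Hedged sketch of two MWP augmentation flavors described in the abstract.
# Not the paper's implementation; model id, prompt, and helpers are assumed.
import random

import nltk
from nltk.corpus import wordnet
from transformers import pipeline

nltk.download("wordnet", quiet=True)


def synonym_replace(problem_text: str, n_swaps: int = 2) -> str:
    """Replace up to n_swaps words that have WordNet synonyms."""
    tokens = problem_text.split()
    candidates = [i for i, t in enumerate(tokens) if wordnet.synsets(t)]
    random.shuffle(candidates)
    swapped = 0
    for i in candidates:
        lemmas = {
            lemma.name().replace("_", " ")
            for synset in wordnet.synsets(tokens[i])
            for lemma in synset.lemmas()
        } - {tokens[i]}
        if lemmas:
            tokens[i] = random.choice(sorted(lemmas))
            swapped += 1
        if swapped >= n_swaps:
            break
    return " ".join(tokens)


# Hypothetical checkpoint id; substitute whichever Llama-7b weights you use.
rephraser = pipeline("text-generation", model="huggyllama/llama-7b")

PROMPT = (
    "Rephrase the following math word problem without changing any "
    "numbers or the question being asked.\n\nProblem: {problem}\nRephrased:"
)


def llm_rephrase(problem_text: str) -> str:
    """Instruction-style rephrasing; keeps quantities so the equation still holds."""
    out = rephraser(
        PROMPT.format(problem=problem_text),
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
    )
    return out[0]["generated_text"].split("Rephrased:")[-1].strip()


if __name__ == "__main__":
    mwp = "John has 5 apples and buys 3 more. How many apples does he have now?"
    print(synonym_replace(mwp))
    print(llm_rephrase(mwp))
```

Under this reading, the rephrased and synonym-swapped variants would be concatenated with the original training set, matching the abstract's observation that combining examples from different augmentation methods helps most.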
Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. 
In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. 
[2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). 
Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. 
[2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. 
[2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 
1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. 
arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). 
IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. 
[2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: MAWPS: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016)
Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are NLP models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021)
Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing English math word problem solvers. arXiv preprint arXiv:2106.15772 (2021)
Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017)
Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018)
Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016)
Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985)
Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007)
Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE
Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018)
Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE
Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)
Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: MWPToolkit: An open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022)
Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015)
Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015)
Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017)
Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016)
Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018)
Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018)
Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019)
Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019)
Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020)
Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach (2019)
Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI Blog 1(8), 9 (2019)
Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022)
Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022)
Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022)
Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022)
Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022)
Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kalyan, A.: Let GPT be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023)
Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022)
Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022)
Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020)
Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: RoDA: Reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021)
Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023)
Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: MWP-BERT: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
[2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. 
[2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. 
In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. 
In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 
213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. 
In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
- Fan, A., Jernite, Y., Perez, E., Grangier, D., Weston, J., Auli, M.: Eli5: Long form question answering. arXiv preprint arXiv:1907.09190 (2019) Jin et al. [2023] Jin, S., Lian, X., Jung, H., Park, J., Suh, J.: Building a deep learning-based qa system from a cqa dataset. Decision Support Systems, 114038 (2023) Abdel-Nabi et al. [2023] Abdel-Nabi, H., Awajan, A., Ali, M.Z.: Deep learning-based question answering: a survey. Knowledge and Information Systems 65(4), 1399–1485 (2023) Rogers et al. [2023] Rogers, A., Gardner, M., Augenstein, I.: Qa dataset explosion: A taxonomy of nlp resources for question answering and reading comprehension. ACM Computing Surveys 55(10), 1–45 (2023) Yigit and Amasyali [2019] Yigit, G., Amasyali, M.F.: Ask me: A question answering system via dynamic memory networks. In: 2019 Innovations in Intelligent Systems and Applications Conference (ASYU), pp. 1–5 (2019). IEEE Xie and Sun [2019] Xie, Z., Sun, S.: A goal-driven tree-structured neural model for math word problems. In: IJCAI, pp. 5299–5305 (2019) Zhang et al. [2020] Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. 
In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Jin, S., Lian, X., Jung, H., Park, J., Suh, J.: Building a deep learning-based qa system from a cqa dataset. Decision Support Systems, 114038 (2023) Abdel-Nabi et al. [2023] Abdel-Nabi, H., Awajan, A., Ali, M.Z.: Deep learning-based question answering: a survey. Knowledge and Information Systems 65(4), 1399–1485 (2023) Rogers et al. [2023] Rogers, A., Gardner, M., Augenstein, I.: Qa dataset explosion: A taxonomy of nlp resources for question answering and reading comprehension. ACM Computing Surveys 55(10), 1–45 (2023) Yigit and Amasyali [2019] Yigit, G., Amasyali, M.F.: Ask me: A question answering system via dynamic memory networks. In: 2019 Innovations in Intelligent Systems and Applications Conference (ASYU), pp. 1–5 (2019). IEEE Xie and Sun [2019] Xie, Z., Sun, S.: A goal-driven tree-structured neural model for math word problems. In: IJCAI, pp. 5299–5305 (2019) Zhang et al. [2020] Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. 
[2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. 
arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Abdel-Nabi, H., Awajan, A., Ali, M.Z.: Deep learning-based question answering: a survey. Knowledge and Information Systems 65(4), 1399–1485 (2023) Rogers et al. [2023] Rogers, A., Gardner, M., Augenstein, I.: Qa dataset explosion: A taxonomy of nlp resources for question answering and reading comprehension. ACM Computing Surveys 55(10), 1–45 (2023) Yigit and Amasyali [2019] Yigit, G., Amasyali, M.F.: Ask me: A question answering system via dynamic memory networks. In: 2019 Innovations in Intelligent Systems and Applications Conference (ASYU), pp. 1–5 (2019). IEEE Xie and Sun [2019] Xie, Z., Sun, S.: A goal-driven tree-structured neural model for math word problems. In: IJCAI, pp. 5299–5305 (2019) Zhang et al. [2020] Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. 
[2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. 
arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Rogers, A., Gardner, M., Augenstein, I.: Qa dataset explosion: A taxonomy of nlp resources for question answering and reading comprehension. ACM Computing Surveys 55(10), 1–45 (2023) Yigit and Amasyali [2019] Yigit, G., Amasyali, M.F.: Ask me: A question answering system via dynamic memory networks. In: 2019 Innovations in Intelligent Systems and Applications Conference (ASYU), pp. 1–5 (2019). IEEE Xie and Sun [2019] Xie, Z., Sun, S.: A goal-driven tree-structured neural model for math word problems. In: IJCAI, pp. 5299–5305 (2019) Zhang et al. [2020] Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 
845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Ask me: A question answering system via dynamic memory networks. In: 2019 Innovations in Intelligent Systems and Applications Conference (ASYU), pp. 1–5 (2019). IEEE Xie and Sun [2019] Xie, Z., Sun, S.: A goal-driven tree-structured neural model for math word problems. In: IJCAI, pp. 5299–5305 (2019) Zhang et al. [2020] Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. 
In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. 
arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Xie, Z., Sun, S.: A goal-driven tree-structured neural model for math word problems. In: IJCAI, pp. 5299–5305 (2019) Zhang et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. 
[2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. 
arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. 
[2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 
2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. 
[2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. 
arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. 
arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. 
[2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. 
arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. 
arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. 
arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Abdel-Nabi, H., Awajan, A., Ali, M.Z.: Deep learning-based question answering: a survey. Knowledge and Information Systems 65(4), 1399–1485 (2023) Rogers et al. [2023] Rogers, A., Gardner, M., Augenstein, I.: Qa dataset explosion: A taxonomy of nlp resources for question answering and reading comprehension. ACM Computing Surveys 55(10), 1–45 (2023) Yigit and Amasyali [2019] Yigit, G., Amasyali, M.F.: Ask me: A question answering system via dynamic memory networks. In: 2019 Innovations in Intelligent Systems and Applications Conference (ASYU), pp. 1–5 (2019). IEEE Xie and Sun [2019] Xie, Z., Sun, S.: A goal-driven tree-structured neural model for math word problems. In: IJCAI, pp. 5299–5305 (2019) Zhang et al. [2020] Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 
845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Rogers, A., Gardner, M., Augenstein, I.: Qa dataset explosion: A taxonomy of nlp resources for question answering and reading comprehension. ACM Computing Surveys 55(10), 1–45 (2023) Yigit and Amasyali [2019] Yigit, G., Amasyali, M.F.: Ask me: A question answering system via dynamic memory networks. In: 2019 Innovations in Intelligent Systems and Applications Conference (ASYU), pp. 1–5 (2019). IEEE Xie and Sun [2019] Xie, Z., Sun, S.: A goal-driven tree-structured neural model for math word problems. In: IJCAI, pp. 5299–5305 (2019) Zhang et al. [2020] Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. 
Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. 
In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Ask me: A question answering system via dynamic memory networks. In: 2019 Innovations in Intelligent Systems and Applications Conference (ASYU), pp. 1–5 (2019). IEEE Xie and Sun [2019] Xie, Z., Sun, S.: A goal-driven tree-structured neural model for math word problems. In: IJCAI, pp. 5299–5305 (2019) Zhang et al. [2020] Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 
2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Xie, Z., Sun, S.: A goal-driven tree-structured neural model for math word problems. In: IJCAI, pp. 5299–5305 (2019) Zhang et al. 
[2020] Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. 
In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 
7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. 
[2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. 
[2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. 
Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. 
In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 
6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. 
arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 
213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. 
arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
- Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing English math word problem solvers. arXiv preprint arXiv:2106.15772 (2021)
- Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017)
- Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018)
- Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016)
- Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985)
- Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007)
- Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE
- Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018)
- Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE
- Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)
- Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: MWPToolkit: An open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022)
- Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
- Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015)
- Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015)
- Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017)
- Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016)
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Xie, Z., Sun, S.: A goal-driven tree-structured neural model for math word problems. In: IJCAI, pp. 5299–5305 (2019) Zhang et al. 
[2020] Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. 
In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 
7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. 
[2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. 
[2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. 
[2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 
845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). 
IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. 
[2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. 
In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. 
[2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 
845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. 
[2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. 
arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. 
[2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 
2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. 
[2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. 
arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. 
Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018)
arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 
213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. 
arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. 
arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. 
arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
- Rogers, A., Gardner, M., Augenstein, I.: Qa dataset explosion: A taxonomy of nlp resources for question answering and reading comprehension. ACM Computing Surveys 55(10), 1–45 (2023) Yigit and Amasyali [2019] Yigit, G., Amasyali, M.F.: Ask me: A question answering system via dynamic memory networks. In: 2019 Innovations in Intelligent Systems and Applications Conference (ASYU), pp. 1–5 (2019). IEEE Xie and Sun [2019] Xie, Z., Sun, S.: A goal-driven tree-structured neural model for math word problems. In: IJCAI, pp. 5299–5305 (2019) Zhang et al. [2020] Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 
2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Ask me: A question answering system via dynamic memory networks. In: 2019 Innovations in Intelligent Systems and Applications Conference (ASYU), pp. 1–5 (2019). 
IEEE Xie and Sun [2019] Xie, Z., Sun, S.: A goal-driven tree-structured neural model for math word problems. In: IJCAI, pp. 5299–5305 (2019) Zhang et al. [2020] Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. 
[2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Xie, Z., Sun, S.: A goal-driven tree-structured neural model for math word problems. In: IJCAI, pp. 5299–5305 (2019) Zhang et al. [2020] Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. 
[2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. 
In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 
2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. 
[2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. 
arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. 
[2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. 
[2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: Reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021)
Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023)
Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 
2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 
845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 
213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 
213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. 
In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. 
In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. 
arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
- Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing English math word problem solvers. arXiv preprint arXiv:2106.15772 (2021)
- Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017)
- Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018)
- Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016)
- Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985)
- Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007)
- Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE
- Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018)
- Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE
- Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)
- Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: MWPToolkit: An open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022)
- Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
- Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015)
- Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015)
- Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017)
- Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016)
- Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018)
- Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018)
- Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019)
- Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019)
- Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020)
- Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
- Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach (2019)
- Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI Blog 1(8), 9 (2019)
- Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022)
- Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022)
- Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022)
- Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022)
- Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022)
- Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let GPT be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023)
- Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022)
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022)
- Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020)
- Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: RODA: Reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021)
- Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023)
- Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: MWP-BERT: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 
2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). 
Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. 
[2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. 
[2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. 
In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. 
[2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. 
[2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. 
In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. 
arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. 
[2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. 
[2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. 
In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. 
[2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. 
[2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. 
[2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 
845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). 
IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. 
[2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. 
In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. 
[2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 
845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. 
[2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. 
[2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. 
Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. 
In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 
6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. 
arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 
213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. 
arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. 
arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. 
arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
- Zhang, J., Lee, R.K.-W., Lim, E.-P., Qin, W., Wang, L., Shao, J., Sun, Q.: Teacher-student networks with multiple decoders for solving math word problem. (2020). IJCAI Liang et al. [2022] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 997–1009 (2022) Wang et al. [2019] Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. 
In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). 
IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. 
[2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. 
[2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. 
In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. 
In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 
805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. 
[2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. 
Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. 
arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. 
In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 
213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. 
In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. 
[2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. 
[2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. 
arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. 
arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. 
arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. 
arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. 
arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 
2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. 
[2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. 
[2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. 
[2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 
845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. 
In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. 
In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. 
Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 
213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. 
arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. 
In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. 
arXiv preprint arXiv:2107.13435 (2021) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. 
arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. 
arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
- Wang, L., Zhang, D., Zhang, J., Xu, X., Gao, L., Dai, B.T., Shen, H.T.: Template-based math word problem solvers with recursive neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7144–7151 (2019) Zhang et al. [2020] Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. 
In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, J., Wang, L., Lee, R.K.-W., Bin, Y., Wang, Y., Shao, J., Lim, E.-P.: Graph-to-tree learning for solving math word problems. (2020). Association for Computational Linguistics Shen and Jin [2020] Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. 
In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 
805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shen, Y., Jin, C.: Solving math word problems with multi-encoders and multi-decoders. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 2924–2934 (2020) Upadhyay and Chang [2016] Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. 
In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. 
[2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. 
[2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. 
[2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. 
[2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. 
Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. 
In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. 
[2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. 
[2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. 
In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. 
[2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. 
arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
[2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. 
[2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. 
In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. 
arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. 
In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. 
Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. 
[2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. 
[2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. 
In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. 
arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 
845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
[2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. 
arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022)
- Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing English math word problem solvers. arXiv preprint arXiv:2106.15772 (2021)
- Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017)
- Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018)
- Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016)
- Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985)
- Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007)
- Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE
- Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018)
- Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE
- Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)
- Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: MWPToolkit: An open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022)
- Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
- Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015)
- Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015)
- Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017)
- Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016)
- Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018)
- Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018)
- Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019)
- Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019)
- Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020)
- Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
- Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach (2019)
- Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI Blog 1(8), 9 (2019)
- Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022)
- Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022)
- Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022)
- Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022)
- Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let GPT be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023)
- Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022)
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022)
- Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020)
- Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: RODA: Reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021)
- Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023)
- Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: MWP-BERT: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
[2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. 
arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. 
In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 
13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. 
[2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. 
Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. 
In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. 
arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. 
[2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. 
arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
- Upadhyay, S., Chang, M.-W.: Annotating derivations: A new evaluation strategy and dataset for algebra word problems. arXiv preprint arXiv:1609.07197 (2016) Qin et al. [2020] Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. 
[2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. 
arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020) Kushman et al. [2014] Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. 
Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. 
arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. 
In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 
805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. 
[2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. 
[2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. 
[2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. 
[2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. 
arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
- Qin, J., Lin, L., Liang, X., Zhang, R., Lin, L.: Semantically-aligned universal tree-structured solver for math word problems. arXiv preprint arXiv:2010.06823 (2020)
- Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014)
- Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016)
- Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021)
- Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021)
- Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017)
- Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018)
- Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016)
- Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985)
- Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007)
- Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE
- Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018)
- Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE
- Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)
- Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: An open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022)
- Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
- Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015)
- Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015)
- Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017)
- Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016)
- Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018)
- Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018)
- Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019)
- Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019)
[2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. 
arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. 
In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 
13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. 
[2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. 
Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. 
In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. 
arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. 
[2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. 
arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
- Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 271–281 (2014) Koncel-Kedziorski et al. [2016] Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. 
[2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. 
In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. 
Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. 
arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 
845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023)
Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017)
Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018)
Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016)
Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985)
Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007)
Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE
Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018)
Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE
Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)
Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022)
Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015)
Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015)
Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017)
Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016)
Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018)
Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018)
Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019)
Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019)
Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020)
Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019)
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019)
Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022)
Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022)
Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022)
Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022)
Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022)
Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kalyan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023)
Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022)
Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022)
Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020)
Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021)
[2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. 
arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. 
arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
- Koncel-Kedziorski, R., Roy, S., Amini, A., Kushman, N., Hajishirzi, H.: Mawps: A math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1152–1157 (2016) Patel et al. [2021] Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. 
[2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Patel, A., Bhattamishra, S., Goyal, N.: Are nlp models really able to solve simple math word problems? arXiv preprint arXiv:2103.07191 (2021) Miao et al. [2021] Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. 
In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing english math word problem solvers. arXiv preprint arXiv:2106.15772 (2021) Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. 
In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017) Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 
2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. 
[2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. 
[2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. 
[2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. 
arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. 
arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018) Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. 
[2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. 
arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016) Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. 
[2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. 
arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985) Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. 
In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. 
In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. 
In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 
213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. 
In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. 
[2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. 
[2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. 
arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. 
arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. 
arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. 
arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
- Miao, S.-Y., Liang, C.-C., Su, K.-Y.: A diverse corpus for evaluating and developing English math word problem solvers. arXiv preprint arXiv:2106.15772 (2021)
Wang et al. [2017] Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017)
Roy and Roth [2018] Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018)
Mitra and Baral [2016] Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016)
Fletcher [1985] Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985)
Bakman [2007] Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007)
Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE
Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018)
Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE
Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)
Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: An open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022)
Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015)
Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015)
Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017)
Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016)
Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018)
Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018)
Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019)
Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019)
Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020)
Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019)
[2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. 
[2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 
213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. 
[2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. 
[2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. 
arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. 
arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
- Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019)
- Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022)
- Wang, Y., Liu, X., Shi, S.: Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 845–854 (2017)
- Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018)
- Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016)
- Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985)
- Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007)
- Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE
- Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018)
- Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE
- Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)
- Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022)
- Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
- Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015)
- Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015)
- Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017)
- Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016)
- Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018)
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. 
In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. 
[2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. 
[2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. 
In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. 
[2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. 
[2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. 
[2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
- Roy, S., Roth, D.: Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics 6, 159–172 (2018)
- Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016)
- Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985)
- Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007)
- Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE
- Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018)
- Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE
- Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)
- Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: An open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022)
- Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
- Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015)
- Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015)
6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. 
arXiv preprint arXiv:2107.13435 (2021) Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. [2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. 
[2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. 
arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. 
[2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. 
[2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. 
arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. 
In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. 
Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 
805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
- Mitra, A., Baral, C.: Learning to use formulas to solve simple arithmetic problems. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2144–2153 (2016)
- Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985)
- Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007)
- Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE
- Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018)
- Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE
- Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)
- Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: MWPToolkit: An open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022)
- Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
- Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015)
- Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015)
- Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017)
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007) Yuhui et al. 
[2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). 
IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. 
[2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. 
arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. 
Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 
805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
- Fletcher, C.R.: Understanding and solving arithmetic word problems: A computer simulation. Behavior Research Methods, Instruments, & Computers 17(5), 565–571 (1985)
- Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007)
- Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE
- Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018)
- Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE
- Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)
- Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022)
- Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
- Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015)
- Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015)
- Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017)
- Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016)
- Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018)
- Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018)
- Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019)
- Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019)
- Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020)
- Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
- Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
- Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019)
- Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022)
- Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022)
- Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022)
- Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022)
- Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022)
- Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kalyan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023)
- Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022)
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022)
[2010] Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). 
IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. 
[2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. 
arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. 
Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 
805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. 
- Bakman, Y.: Robust understanding of word problems with extraneous information. arXiv preprint math/0701393 (2007)
- Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE
- Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018)
- Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE
- Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)
- Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: MWPToolkit: An open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022)
- Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
- Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015)
- Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015)
arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. 
In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. 
Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 
805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. 
[2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. 
Yuhui, M., Ying, Z., Guangzuo, C., Yun, R., Ronghuai, H.: Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 2, pp. 476–479 (2010). IEEE
Wang et al. [2018] Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018)
Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE
Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)
Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: An open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022)
Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015)
Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015)
Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017)
Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016)
Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018)
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. 
[2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. 
arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. 
arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
- Wang, L., Wang, Y., Cai, D., Zhang, D., Liu, X.: Translating a math word problem to an expression tree. arXiv preprint arXiv:1811.05632 (2018) Yigit and Amasyali [2023] Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. 
[2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE Touvron et al. [2023] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. 
[2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. 
arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023) Lan et al. [2022] Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. 
arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: an open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022) Hosseini et al. [2014] Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014) Zhou et al. [2015] Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. 
In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. 
[2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. 
[2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. 
[2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. 
- Yigit, G., Amasyali, M.F.: Exploring the benefits of data augmentation in math word problem solving. In: 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), pp. 1–6 (2023). IEEE
- Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)
- Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: Mwptoolkit: An open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022)
- Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
- Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015)
- Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015)
- Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017)
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015) Koncel-Kedziorski et al. [2015] Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015) Huang et al. [2017] Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. 
[2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. 
arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. 
arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. 
[2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 
6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. 
arXiv preprint arXiv:2107.13435 (2021) Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. 
arXiv preprint arXiv:2107.13435 (2021) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. 
arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. 
[2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kalyan, A.: Let GPT be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023)
- Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022)
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022)
- Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020)
- Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: RODA: Reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021)
- Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023)
- Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: MWP-BERT: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
- Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022)
- Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022)
- Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022)
- Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022)
- Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022)
- Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al.: LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)
- Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: MWPToolkit: An open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022)
- Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
- Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015)
- Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015)
- Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017)
- Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016)
- Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018)
- Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018)
- Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019)
- Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019)
- Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020)
- Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
- Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach (2019)
- Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019)
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. 
arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. 
arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
- Lan, Y., Wang, L., Zhang, Q., Lan, Y., Dai, B.T., Wang, Y., Zhang, D., Lim, E.-P.: MWPToolkit: An open-source framework for deep learning-based math word problem solvers. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 13188–13190 (2022)
- Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
- Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015)
- Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015)
- Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017)
- Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016)
- Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018)
- Chen et al.
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. 
arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. 
arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
- Hosseini, M.J., Hajishirzi, H., Etzioni, O., Kushman, N.: Learning to solve arithmetic word problems with verb categorization. In: EMNLP, pp. 523–533 (2014)
- Zhou, L., Dai, S., Chen, L.: Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 817–822 (2015)
- Koncel-Kedziorski, R., Hajishirzi, H., Sabharwal, A., Etzioni, O., Ang, S.D.: Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics 3, 585–597 (2015)
- Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017)
- Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016)
- Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018)
- Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018)
- Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019)
- Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019)
- Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020)
- Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
- Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019)
- Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022)
- Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022)
- Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022)
- Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022)
- Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022)
- Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kalyan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023)
- Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022)
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022)
- Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020)
- Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: Reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021)
- Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023)
- Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
[2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 
6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. 
arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. 
arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. 
arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. 
arXiv preprint arXiv:2107.13435 (2021) Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017) Zhang et al. [2016] Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. 
[2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016) Huang et al. [2018] Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. 
arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018) Chiang and Chen [2018] Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. 
[2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018) Li et al. [2019] Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019) Meng and Rumshisky [2019] Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019) Li et al. [2020] Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. 
[2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. 
[2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. 
arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. 
arXiv preprint arXiv:2107.13435 (2021) Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. 
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
- Huang, D., Shi, S., Lin, C.-Y., Yin, J.: Learning fine-grained expressions to solve math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 805–814 (2017)
- Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016)
- Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018)
- Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018)
- Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019)
- Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019)
- Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020)
- Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
- Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019)
- Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019)
- Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022)
- Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022)
- Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022)
- Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022)
- Zhang, B., Xiong, D., Su, J., Duan, H., Zhang, M.: Variational neural machine translation. arXiv preprint arXiv:1605.07869 (2016)
- Huang, D., Liu, J., Lin, C.-Y., Yin, J.: Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 213–223 (2018)
- Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018)
- Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019)
- Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019)
- Li, S., Wu, L., Feng, S., Xu, F., Xu, F., Zhong, S.: Graph-to-tree neural networks for learning structured input-output translation with applications to semantic parsing and math word problem. arXiv preprint arXiv:2004.13781 (2020)
- Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
- Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
- Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019)
- Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022)
- Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022)
- Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022)
- Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022)
- Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022)
- Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kalyan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023)
- Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022)
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022)
- Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020)
- Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: Reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021)
- Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023)
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019) Radford et al. [2019] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. 
arXiv preprint arXiv:2107.13435 (2021) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019) Shao et al. [2022] Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022) Li et al. [2022] Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. 
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. 
[2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. 
[2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. 
arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
- Chiang, T.-R., Chen, Y.-N.: Semantically-aligned equation generation for solving and reasoning math word problems. arXiv preprint arXiv:1811.00720 (2018)
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
- Li, J., Wang, L., Zhang, J., Wang, Y., Dai, B.T., Zhang, D.: Modeling intra-relation in math word problems with different functional multi-head attentions. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6162–6167 (2019)
- Meng, Y., Rumshisky, A.: Solving math word problems with double-decoder transformer. arXiv preprint arXiv:1908.10924 (2019)
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022) Wang et al. [2022] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022) Pi et al. [2022] Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. 
arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. 
arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. 
[2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. 
[2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
Chen et al.
[2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022) Chen et al. [2022] Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. 
arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022) Liang et al. [2023] Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kaylan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023) Lazaridou et al. [2022] Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. 
[2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022) Wei et al. [2022] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022) Brown et al. [2020] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020) Liu et al. [2021] Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. 
arXiv preprint arXiv:2107.13435 (2021) Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021) Raiyan et al. [2023] Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023) Liang et al. [2021] Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021) Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)
- Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach (2019)
- Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019)
- Shao, Z., Huang, F., Huang, M.: Chaining simultaneous thoughts for numerical reasoning. arXiv preprint arXiv:2211.16482 (2022)
- Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., Chen, W.: On the advance of making language models better reasoners. arXiv preprint arXiv:2206.02336 (2022)
- Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022)
- Pi, X., Liu, Q., Chen, B., Ziyadi, M., Lin, Z., Fu, Q., Gao, Y., Lou, J.-G., Chen, W.: Reasoning like program executors. arXiv preprint arXiv:2201.11473 (2022)
- Chen, W., Ma, X., Wang, X., Cohen, W.W.: Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588 (2022)
- Liang, Z., Yu, W., Rajpurohit, T., Clark, P., Zhang, X., Kalyan, A.: Let gpt be a math tutor: Teaching math word problem solvers with customized exercise generation. arXiv preprint arXiv:2305.14386 (2023)
- Lazaridou, A., Gribovskaya, E., Stokowiec, W., Grigorev, N.: Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint arXiv:2203.05115 (2022)
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837 (2022)
- Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020)
- Liu, Q., Guan, W., Li, S., Cheng, F., Kawahara, D., Kurohashi, S.: Roda: Reverse operation based data augmentation for solving math word problems. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1–11 (2021)
- Raiyan, S.R., Faiyaz, M.N., Kabir, S.M.J., Kabir, M., Mahmud, H., Hasan, M.K.: Math word problem solving by generating linguistic variants of problem statements. arXiv preprint arXiv:2306.13899 (2023)
- Liang, Z., Zhang, J., Wang, L., Qin, W., Lan, Y., Shao, J., Zhang, X.: Mwp-bert: Numeracy-augmented pre-training for math word problem solving. arXiv preprint arXiv:2107.13435 (2021)