Introduction
Recent advances in language models (LMs) have led to remarkable progress in their application to mathematics. This survey concentrates on mathematical LMs, which include Pre-trained Language Models (PLMs) and Large-scale Language Models (LLMs). These models address mathematical tasks such as calculation and reasoning, and this ability is changing both mathematical exploration and practical use.
Mathematical Tasks
Mathematical tasks tackled by LMs fall into two main categories: mathematical calculation and mathematical reasoning. Mathematical calculation primarily involves arithmetic operations and the representation of numerical data. Early LMs showed only basic computational skills using textual number representations; over time, they have come to handle arithmetic operations more proficiently. Approaches such as GenBERT and NF-NSM inject numeric data directly into PLMs to improve their mathematical performance.
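To make the idea of a textual number representation concrete, the following is a minimal sketch of one simple scheme: splitting every number into single-character tokens so that a model sees each digit separately. It is an illustration only and is not claimed to be the exact preprocessing used by GenBERT or NF-NSM; the function name split_digits and the whitespace-based tokenization are assumptions made for this example.

```python
import re

# Matches an integer or a simple decimal such as "12" or "3.5".
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def split_digits(text: str) -> list[str]:
    """Split on whitespace, then break each number into single characters.

    Hypothetical helper for illustration; real tokenizers are subword-based.
    """
    tokens: list[str] = []
    for piece in text.split():
        pos = 0
        for match in NUMBER.finditer(piece):
            # Keep any non-numeric text before the number as one token.
            if match.start() > pos:
                tokens.append(piece[pos:match.start()])
            # Emit the number one character at a time.
            tokens.extend(match.group())
            pos = match.end()
        # Keep any trailing non-numeric text.
        if pos < len(piece):
            tokens.append(piece[pos:])
    return tokens


if __name__ == "__main__":
    print(split_digits("add 12 and 3.5"))
```

With this scheme, "12" and "120" share their leading tokens, which is one way a model can be exposed to place-value structure instead of treating each number as an opaque word.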
Mathematical reasoning, on the other hand, involves solving complex problems that require multi-step logical thought. Recent studies show that LLMs can generate elaborate chains of thought when the prompt includes worked reasoning examples, which raises their success rates on a range of tasks.
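The prompting pattern described above can be sketched as a small prompt builder: a few worked examples, each with its reasoning written out, followed by the new question. This is a minimal sketch under stated assumptions; the Exemplar class, the build_cot_prompt function, and the "Q:/A:" layout are hypothetical choices for illustration, and no particular model API is assumed.

```python
from dataclasses import dataclass


@dataclass
class Exemplar:
    """One worked example: a question, its step-by-step reasoning, and the answer."""
    question: str
    reasoning: str
    answer: str


def build_cot_prompt(exemplars: list[Exemplar], question: str) -> str:
    """Assemble a few-shot prompt whose examples spell out their reasoning."""
    blocks = [
        f"Q: {ex.question}\nA: {ex.reasoning} The answer is {ex.answer}."
        for ex in exemplars
    ]
    # The final block is left open so the model continues with its own reasoning.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    demo = Exemplar(
        question="Tom has 3 apples and buys 2 more. How many does he have?",
        reasoning="Tom starts with 3 apples. Buying 2 more gives 3 + 2 = 5.",
        answer="5",
    )
    print(build_cot_prompt([demo], "A box holds 4 pens. How many pens are in 6 boxes?"))
```

The resulting string would be sent to a model as-is; the worked reasoning in each example is what encourages the model to write out intermediate steps before its final answer.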
LLM Methodologies
The methodologies used to achieve mathematical proficiency can be categorized by whether they build on PLMs or LLMs. Among PLM-based methods, autoregressive LMs (ALMs) and non-autoregressive LMs (NALMs) are the two main approaches to comprehending and generating mathematical expressions. LLM-based methods employ strategies such as instruction learning, tool-based methods, and chain-of-thought (CoT) techniques to improve mathematical reasoning. They draw on tools such as symbolic solvers and computer programs to assist in problem-solving, and also use training techniques such as fine-tuning to improve performance on specific arithmetic tasks.
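As an illustration of the tool-based idea, the sketch below shows the "tool" side of such a pipeline: instead of trusting a model to compute a result token by token, the model is assumed to emit an arithmetic expression as text, and a small evaluator computes the value exactly. The evaluate function and the restriction to the four basic operators are assumptions made for this example; it does not reproduce any specific system covered by the survey.

```python
import ast
import operator

# Only these binary operators are allowed; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expression: str) -> float:
    """Evaluate a plain arithmetic expression without using eval()."""

    def walk(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        # Names, calls, attribute access, etc. are refused.
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return walk(ast.parse(expression, mode="eval"))


if __name__ == "__main__":
    # Hypothetical model output for "7 boxes of 12 eggs, minus 5 broken".
    print(evaluate("7 * 12 - 5"))
```

Walking the syntax tree rather than calling eval() keeps the tool limited to arithmetic, which matters when the expression text comes from a model rather than a trusted author.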
Datasets and Challenges
The survey compiles over 60 mathematical datasets, divided into training datasets, benchmark datasets, and augmented datasets. These datasets are critical to both training and evaluating mathematical models, and they cover a wide range of difficulty, from basic arithmetic to advanced theorem proving.
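For benchmark datasets with a single final answer, evaluation often reduces to comparing a predicted answer string against a reference. The following is a minimal sketch of such a scorer, assuming answers are short strings that are either numbers or text labels; the normalize and accuracy helpers are hypothetical and do not correspond to the official scoring script of any particular dataset.

```python
def normalize(answer: str) -> str:
    """Canonicalize an answer so that "1,000" and "1000.0" compare equal."""
    cleaned = answer.strip().replace(",", "")
    try:
        # Numeric answers are compared by value, not by spelling.
        return repr(float(cleaned))
    except ValueError:
        # Non-numeric answers fall back to case-insensitive text matching.
        return cleaned.lower()


def accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions whose normalized form matches the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return correct / len(references)


if __name__ == "__main__":
    print(accuracy(["1,000", "3.50", "x"], ["1000", "3.5", "y"]))
```

A scorer like this only checks the final answer; it says nothing about whether the reasoning that produced the answer was sound, and it does not apply to tasks such as theorem proving, where outputs must be checked by other means.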
Despite this progress, challenges persist: ensuring faithfulness in model output, extending multi-modal capabilities to handle non-textual mathematical information, managing uncertainty in calculations, devising robust evaluation metrics, and applying these models in educational settings as teaching aids.
Conclusion
The intersection of artificial intelligence and mathematical problem-solving is expanding rapidly, driven by the capabilities of mathematical LMs. By addressing the hurdles above and building on the strengths of PLMs and LLMs, these models could substantially change mathematics and its many applications. This survey aims to stimulate further research by giving a detailed account of current successes, identifying areas for growth, and outlining prospective directions for this field.