Overview

InternLM-Math builds on InternLM2 to improve abstract mathematical reasoning in LLMs, introducing novel training methods that establish state-of-the-art (SOTA) performance on mathematical reasoning benchmarks.

The model uses a comprehensive pre-training phase with diverse data sources and deduplication techniques to deepen mathematical understanding, followed by supervised fine-tuning that enhances problem-solving capabilities.

It integrates reasoning and coding abilities in a seq2seq format called RICO and employs the LEAN theorem prover for formal reasoning, bridging informal reasoning and formal verification.

It incorporates reward modeling to refine the selection of solutions and reasoning paths, with potential applications in education, automated theorem proving, and formal verification.
Enhancing Mathematical Reasoning in LLMs Through InternLM-Math
Introduction
InternLM-Math emerges as a notable advancement in the domain of mathematical reasoning LLMs, building on its predecessor InternLM2. This work seeks to significantly enhance the abstract reasoning capabilities of LLMs, especially in the mathematical domain, by introducing a series of novel methodologies and integrating them into a coherent training regimen. The resulting model demonstrates superior performance across a variety of mathematical reasoning benchmarks, establishing new state-of-the-art (SOTA) results in the field.
Advancements in Pre-Training and Fine-Tuning
The InternLM-Math model introduces an intricate pre-training phase, leveraging a diverse data corpus that includes common-crawl data, domain-specific data, and synthetic data aimed at reinforcing the model's numerical operation capabilities. This pre-training strategy enriches not only the model's understanding of mathematical concepts but also its ability to apply them in diverse contexts. The use of deduplication techniques and exact-formulation decontamination further refines the quality of the training data, ensuring a high degree of relevance and accuracy in the model's learning process.
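As one illustration of the deduplication step, here is a minimal exact-match sketch that normalizes and hashes each document, keeping only first occurrences. This is an assumption for illustration (the article does not specify the actual pipeline, which may use fuzzier techniques such as MinHash), and the `normalize` and `deduplicate` names are hypothetical:

```python
import hashlib

def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so trivially different copies hash alike.
    return " ".join(text.lower().split())

def deduplicate(documents):
    # Keep the first occurrence of each normalized document; drop exact repeats.
    seen, unique = set(), []
    for doc in documents:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

docs = ["2 + 2 = 4", "2 + 2  = 4", "3 * 3 = 9"]
print(deduplicate(docs))  # the whitespace-variant duplicate is dropped
```

Exact-formulation decontamination against evaluation sets could reuse the same hashing idea, checking training documents against a set of benchmark-problem digests.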
Subsequently, the supervised fine-tuning (SFT) phase of InternLM-Math's development focuses on a multidimensional enhancement of the model's capabilities. This phase incorporates chain-of-thought reasoning, code interpretation, and an innovative approach to augmenting mathematical problems. These facets collectively boost the model's ability not only to solve mathematical problems but also to generate new problems and verify the correctness of its solutions, thereby supporting a self-improving mechanism within LLMs for math reasoning.
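The problem-augmentation idea can be sketched as a toy numeric-perturbation loop that rewrites the numbers in a problem and recomputes the gold answer. This is an illustrative assumption rather than the paper's actual augmentation method, and `augment_problem` and `answer_fn` are hypothetical names:

```python
import random
import re

def augment_problem(problem: str, answer_fn, rng=random.Random(0)):
    # Perturb each number in the problem and recompute the gold answer,
    # yielding a new (problem, answer) pair for fine-tuning data.
    numbers = [int(n) for n in re.findall(r"\d+", problem)]
    new_numbers = [max(1, n + rng.randint(-2, 2)) for n in numbers]
    replacements = iter(new_numbers)
    new_problem = re.sub(r"\d+", lambda m: str(next(replacements)), problem)
    return new_problem, answer_fn(*new_numbers)

problem = "Alice has 3 apples and buys 5 more. How many apples does she have?"
aug_problem, aug_answer = augment_problem(problem, lambda a, b: a + b)
print(aug_problem, "->", aug_answer)
```

Because the answer is recomputed from the perturbed numbers, each augmented pair remains verifiably correct, which is what makes such data usable for self-improvement.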
Unification of Reasoning and Coding Abilities
A noteworthy innovation in InternLM-Math is the unification of reasoning and coding abilities under a single seq2seq format, termed Reasoning Interleaved with Coding (RICO). This approach enables the model to interleave mathematical reasoning with coding sequences, offering a more natural and human-like problem-solving process. The integration of formal reasoning, through the use of the LEAN theorem prover, further distinguishes InternLM-Math by enabling it to tackle formal mathematical statements, bridging the gap between informal natural-language reasoning and formal mathematical verification.
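To make the interleaving concrete, here is a minimal sketch of how an RICO-style trace might be executed: reasoning text alternates with code segments, and each code segment runs in a shared namespace. The `<code>` delimiters and the `run_rico_trace` helper are assumptions for illustration, not the format actually used by InternLM-Math:

```python
import contextlib
import io
import re

def run_rico_trace(trace: str):
    # Split an interleaved trace into reasoning and code segments; execute
    # each code segment in a shared namespace and collect its printed output.
    segments = re.split(r"<code>(.*?)</code>", trace, flags=re.S)
    namespace, outputs = {}, []
    for i, segment in enumerate(segments):
        if i % 2 == 1:  # odd indices hold the captured code segments
            buffer = io.StringIO()
            with contextlib.redirect_stdout(buffer):
                exec(segment, namespace)
            outputs.append(buffer.getvalue().strip())
    return outputs

trace = ("First compute the sum of the integers 1 through 10. "
         "<code>print(sum(range(1, 11)))</code> "
         "The interpreter returns 55, so the answer is 55.")
print(run_rico_trace(trace))  # ['55']
```

The shared namespace lets later code segments build on earlier ones, mirroring how a human alternates between scratch computation and prose reasoning.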
Leveraging Reward Modeling
Another significant aspect of InternLM-Math is its incorporation of reward modeling to improve the selection of reasoning paths and solutions. By employing both outcome reward models (ORM) and process reward models (PRM), InternLM-Math can more accurately identify and prioritize correct reasoning processes and solutions. This method not only enhances the model's performance on benchmark tasks but also aids in the generation of high-quality, verifiable data for self-improvement.
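A best-of-n reranking step using these two reward signals might look like the following sketch. The scoring functions here are toy stand-ins for trained ORM/PRM networks, and the linear weighting is an assumption rather than the paper's actual procedure:

```python
def best_of_n(candidates, outcome_reward, process_reward, alpha=0.5):
    # Rank candidate solutions by mixing an outcome reward (which scores the
    # final answer) with a process reward (which averages per-step scores).
    def score(solution):
        orm = outcome_reward(solution["answer"])
        steps = solution["steps"]
        prm = sum(process_reward(s) for s in steps) / len(steps)
        return alpha * orm + (1 - alpha) * prm
    return max(candidates, key=score)

# Toy stand-ins for trained reward models.
candidates = [
    {"steps": ["2 + 2 = 5", "the answer is 5"], "answer": 5},
    {"steps": ["2 + 2 = 4", "the answer is 4"], "answer": 4},
]
orm = lambda answer: 1.0 if answer == 4 else 0.0
prm = lambda step: 1.0 if "4" in step else 0.0
best = best_of_n(candidates, orm, prm)
print(best["answer"])  # 4
```

Scoring intermediate steps (PRM) rather than only final answers (ORM) is what allows a reranker to reject solutions that reach a correct answer through flawed reasoning.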
Practical Implications and Future Directions
InternLM-Math's advancements present several practical implications for the field of AI-driven mathematical reasoning. Its ability to generate new mathematical problems and verify solutions opens up avenues for automated curriculum development and evaluation in educational contexts. Additionally, the model's integration of formal reasoning capabilities suggests potential applications in automated theorem proving and formal verification, areas of significant importance in computer science and logic.
Looking forward, InternLM-Math sets the stage for future explorations into the uncharted territories of AI capabilities in mathematics. Its innovative methodologies and impressive performance lay a foundation for further research into self-improving systems, possibly leading to LLMs that can autonomously expand their knowledge and reasoning abilities across various domains of mathematics and beyond.
Conclusion
InternLM-Math represents a significant step forward in the pursuit of advanced mathematical reasoning abilities within LLMs. Through its comprehensive approach to pre-training and fine-tuning, along with the integration of code interpretation, formal reasoning, and reward modeling, InternLM-Math offers a glimpse into the future of AI-driven mathematics education, research, and application. The remarkable performance across multiple benchmarks underlines the efficacy of these innovations, paving the way for further advancements in the domain of AI and mathematical reasoning.
Authors
Huaiyuan Ying, Shuo Zhang, Linyang Li, Zhejian Zhou, Yunfan Shao, Zhaoye Fei, Yichuan Ma, Jiawei Hong, Kuikun Liu, Ziyi Wang, Yudong Wang, Zijian Wu, Shuaibin Li, Fengzhe Zhou, Songyang Zhang, Wenwei Zhang, Hang Yan, Xipeng Qiu, Jiayu Wang, Kai Chen