- The paper proposes LLM-DAL, which enhances LLMs' arithmetic reasoning by decomposing complex tasks into simpler subtasks.
- It introduces a structured learning framework that breaks multiplication into step-by-step operations, leading to significant accuracy improvements.
- Extensive experiments using a synthetic dataset demonstrate that recursive prompting and incremental subtask training boost model generalization over baseline methods.
LLMs and Algorithm Execution: Application to an Arithmetic Function
Introduction
The paper "LLMs and Algorithm Execution: Application to an Arithmetic Function" (2601.07898) explores the capabilities of LLMs in performing algorithmic tasks, focusing specifically on arithmetic operations. Despite substantial advances in LLMs' ability to generate coherent linguistic output, their capacity to execute algorithms autonomously remains limited. The authors introduce LLM-DAL (LLM Decompositional Algorithmic Learning), a model designed to enhance LLMs' ability to perform complex algorithmic inference through a specialized training approach that emphasizes reasoning decomposition.
Challenges in Algorithmic Learning
LLMs traditionally struggle with tasks that require genuine understanding and algorithmic execution. Neural networks excel at pattern recognition but often fall short when tasked with executing algorithms or reasoning logically, as highlighted by the Chinese room thought experiment. This paper identifies the need for LLMs to internalize knowledge and reasoning, moving beyond reproducing learned sequences. Although neural network architectures like the Neural Turing Machine attempt to simulate algorithmic logic, significant limitations such as convergence issues persist.
The paper underscores the problem of LLMs being "stochastic parrots" that rely on statistical repetition rather than understanding, especially in tasks like arithmetic reasoning where memorization rather than logic prevails. Research by Razeghi et al. has shown that arithmetic performance is correlated with the frequency of operands in the training corpus, further supporting this limitation.
State of the Art
The development of LLMs capable of algorithmic reasoning reflects a growing interest in enabling these models to perform tasks demanding logical inference. Notable models such as LLEMMA, Mathstral, QwQ-32B-Preview, and OpenAI's o1 and o3 focus on mathematical reasoning and algorithm execution. Despite these advances, such models still face challenges in coherence and synthesis, signaling the need for improvements in algorithmic cognition.
Proposed Learning Framework
The LLM-DAL approach introduces a learning framework focused on task decomposition, inspired by Chain of Thought (CoT) reasoning. The model breaks down complex tasks into simpler subtasks, facilitating step-by-step learning. For the multiplication task, these subtasks include digit-by-digit multiplication, addition of partial results, digit extraction, and digit concatenation. Through progressive supervised learning across these subtasks, the model is trained to produce logical reasoning descriptions for arithmetic operations.
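The decomposition described above can be sketched in code. This is a minimal illustration of splitting long multiplication into the four subtasks the paper names (digit-by-digit multiplication, addition of partial results, digit extraction, digit concatenation); the function names and exact decomposition are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch of the subtask decomposition applied to long
# multiplication; function names are assumptions, not from the paper.

def extract_digits(n):
    """Digit extraction: split a number into digits, least significant first."""
    return [int(d) for d in str(n)[::-1]]

def partial_products(a, b):
    """Digit-by-digit multiplication: one shifted partial product per digit of b."""
    return [d * a * 10**i for i, d in enumerate(extract_digits(b))]

def add_partials(parts):
    """Addition of partial results."""
    total = 0
    for p in parts:
        total += p
    return total

def concat_digits(digits):
    """Digit concatenation: reassemble digits (least significant first)."""
    return int("".join(str(d) for d in reversed(digits)))

def decomposed_multiply(a, b):
    """Compose the subtasks into the full multiplication."""
    total = add_partials(partial_products(a, b))
    return concat_digits(extract_digits(total))
```

For example, `decomposed_multiply(123, 45)` produces the partial products 615 and 4920, sums them to 5535, then extracts and re-concatenates the digits of the result. In LLM-DAL the model is trained to verbalize each of these steps, rather than call helper functions.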
Experiments and Results
The experimental framework employs an incremental learning process using a series of training corpora, one designed for each identified subtask. The training procedure is divided into stages, starting with fundamental arithmetic operations and advancing to global task refinement. The LLM-DAL approach was evaluated on a synthetic dataset, where it achieved significant accuracy improvements over a vanilla baseline model. The experiments demonstrate that decomposing tasks into subtasks and utilizing recursive prompting significantly enhance the model's ability to generalize and accurately perform algorithmic tasks.
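One way to picture the staged setup is as a generator of per-subtask synthetic corpora, trained in order from elementary operations upward. The example formats, stage names, and sizes below are assumptions for illustration; the paper's actual corpora and prompts may differ.

```python
# Hedged sketch of building per-subtask synthetic corpora for a staged
# curriculum; stage names and example formats are assumptions.
import random

def digit_mult_example(rng):
    """Stage 1: single-digit multiplication facts."""
    a, b = rng.randint(0, 9), rng.randint(0, 9)
    return f"mul_digits: {a} * {b} = {a * b}"

def partial_sum_example(rng):
    """Stage 2: addition of several partial results."""
    xs = [rng.randint(0, 999) for _ in range(3)]
    return f"sum_partials: {' + '.join(map(str, xs))} = {sum(xs)}"

def build_curriculum(n_per_stage=1000, seed=0):
    """Return one synthetic corpus per training stage, in curriculum order."""
    rng = random.Random(seed)
    return {
        "stage1_digit_multiplication": [digit_mult_example(rng) for _ in range(n_per_stage)],
        "stage2_partial_addition": [partial_sum_example(rng) for _ in range(n_per_stage)],
    }
```

Training would then proceed stage by stage over these corpora before refining on the global multiplication task, mirroring the progressive supervision the paper describes.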
Conclusion
The research provides a systematic approach to enhancing LLMs' capabilities in algorithm execution through structured learning and decomposition of complex tasks. By focusing on arithmetic functions, a domain traditionally challenging for LLMs, this work illustrates the potential of LLM-DAL to improve reasoning and extend the generalization capabilities of LLMs. Future research directions include applying this methodology to a wider range of algorithmic tasks and exploring unsupervised extraction of CoT reasoning to further increase the models' autonomy and efficiency in learning.