- The paper introduces ARIES, a multi-agent framework that enables LLMs to perform autonomous reasoning by acting as policy agents within interactive thought graph environments formulated as Markov Decision Processes.
- ARIES achieved up to 29% higher accuracy and reduced inference costs by 35% on benchmarks like HumanEval compared to static methods.
- Experiments showed that policy-agent efficacy depends on model scale, suggesting that larger LLMs or new scaling paradigms are needed for fully autonomous reasoning.
Analyzing the ARIES Framework for Autonomous Reasoning with LLMs
The paper "ARIES: Autonomous Reasoning with LLMs on Interactive Thought Graph Environments" presents a comprehensive exploration of using LLMs to autonomously solve reasoning tasks by leveraging interactive environments structured as thought graphs. The primary motivation is to improve LLM performance on reasoning tasks by moving beyond the static, task-specific transformation schedules that prior work has relied on.
Core Contributions
The authors introduce ARIES, a multi-agent architecture in which LLMs act both as reasoning agents, which expand the thought graph, and as policy agents, which decide which transformation to apply next. They formulate these thought graphs as interactive environments formalized as Markov Decision Processes (MDPs). This formulation allows the system to adapt dynamically to the state of the thought graph and to external feedback, potentially leading to more efficient and accurate problem solving.
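To make the MDP framing concrete, the following is a minimal sketch of a thought graph as an environment with a transition function, assuming a small illustrative action set ("generate", "refine", "stop"); the names and state layout are hypothetical stand-ins, not ARIES's actual interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

@dataclass
class ThoughtGraph:
    """MDP state: candidate partial solutions plus provenance edges."""
    nodes: List[str]
    edges: List[Tuple[int, int]] = field(default_factory=list)
    done: bool = False

def step(state: ThoughtGraph, action: str,
         llm: Callable[[str], str]) -> ThoughtGraph:
    """Transition function: apply one graph transformation.

    `llm` stands in for a reasoning-agent call that maps an existing
    thought to a new candidate thought.
    """
    if state.done:
        return state
    if action == "stop":
        return ThoughtGraph(state.nodes, state.edges, done=True)
    parent = len(state.nodes) - 1
    if action == "generate":
        # Expand the graph with a new child thought.
        nodes = state.nodes + [llm(state.nodes[parent])]
        edges = state.edges + [(parent, len(nodes) - 1)]
        return ThoughtGraph(nodes, edges)
    if action == "refine":
        # Rewrite the latest thought in place.
        nodes = state.nodes[:-1] + [llm(state.nodes[parent])]
        return ThoughtGraph(nodes, state.edges)
    raise ValueError(f"unknown action: {action}")
```

A policy agent would then observe the current `ThoughtGraph` and choose the next action, with verifier or environment feedback shaping the reward.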
Key Findings
- Performance Improvements: The research demonstrates that using LLMs as policy agents, even without supervised fine-tuning, can yield significant performance improvements. For instance, ARIES achieved up to a 29% higher accuracy on the HumanEval benchmark relative to static transformation schedules, while also reducing inference costs by 35%.
- Scalability Concerns: One pivotal insight from the experiments is a scalability limitation tied to LLM parameter count and the depth of problem decomposition. Smaller LLMs (e.g., Llama-70B) are less effective as policy agents, suggesting that successful autonomous reasoning is intrinsically linked to model scale.
- Transition Probabilities: The authors profile the transition probabilities of each graph transformation and find that they vary by task. For example, the success probability of the 'refine' transformation on coding tasks is notably low, which affects overall performance and forces policy agents to adapt their strategies.
- Ensemble Strategy: To mitigate the stochasticity of LLM inference during action selection, the researchers employ an ensemble of policy agents, which reduces variability in the chosen actions and makes the overall reasoning framework more robust.
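The transition-probability profiling can be sketched as a simple empirical estimate over logged outcomes, assuming a hypothetical log of (action, succeeded) pairs; this data format is illustrative, not the paper's.

```python
from collections import defaultdict

def success_rates(outcomes):
    """Estimate per-transformation success probabilities.

    `outcomes` is an iterable of (action, succeeded) pairs, e.g. one
    entry per transformation attempt judged by a verifier.
    """
    hits = defaultdict(int)
    total = defaultdict(int)
    for action, ok in outcomes:
        total[action] += 1
        hits[action] += int(ok)
    return {a: hits[a] / total[a] for a in total}
```

A low estimated rate for 'refine' on coding tasks, as the paper reports, would tell a policy agent to rely on that transformation more sparingly.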
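The ensemble strategy amounts to majority voting over independently sampled policy decisions. Below is a minimal sketch, assuming each agent is a callable that maps the current state to an action name; the callables stand in for separate LLM policy samples.

```python
from collections import Counter

def ensemble_action(state, agents):
    """Pick the action most agents vote for, damping sampling noise."""
    votes = Counter(agent(state) for agent in agents)
    # Break ties deterministically (by action name) for reproducibility.
    action, _ = max(votes.items(), key=lambda kv: (kv[1], kv[0]))
    return action
```

With, say, five sampled agents, an action chosen by chance in one sample is unlikely to win the vote, which is the variance-reduction effect the paper targets.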
Theoretical and Practical Implications
The ARIES framework signals a shift towards using dynamic, model-based reasoning strategies that aim to mimic more holistic, human-like reasoning by utilizing existing world knowledge embedded within LLMs. This approach could transform AI applications requiring adaptive problem-solving without extensive pre-defined programming, such as autonomous coding tasks or interactive decision-making systems.
From a theoretical perspective, the use of LLMs as policy agents suggests new avenues for research in the realms of artificial intelligence and machine learning, inviting exploration into optimizing LLM architectures specifically for decision-making tasks.
Future Directions
The limitations identified by the researchers, particularly concerning LLM size and problem decomposition depth, suggest clear pathways for future work. Scaling LLMs, whether through structural innovations or new training paradigms, would be a priority. Moreover, more sophisticated frameworks for reasoning over highly decomposed tasks, perhaps hierarchical models or hybrid systems that combine structured reasoning with deep learning, merit exploration.
In conclusion, the work presents a significant advancement in reasoning paradigms for LLMs, offering gains in accuracy and efficiency that prompt further exploration. ARIES stands as a promising approach for building more intelligent, adaptive systems that can handle a broader array of complex reasoning tasks in real-world applications.