System-1.x: Learning to Balance Fast and Slow Planning with LLMs
The paper "System-1.x: Learning to Balance Fast and Slow Planning with LLMs" introduces a hybrid planning framework that combines rapid, heuristic-based decision-making with deliberate, step-by-step planning. It addresses the limitations of LLMs on long-horizon planning tasks by balancing speed against accuracy according to problem complexity.
Overview
Traditional LLM planning can be categorized into two distinct modes: System-1 and System-2. System-1 approaches produce plans quickly using heuristics or learned models but often lack robustness and accuracy, especially on complex tasks. Conversely, System-2 approaches incorporate thorough search mechanisms, which yields higher accuracy at a higher computational cost.
The System-1.x Planner, proposed in this paper, strikes a balance between these two paradigms. It consists of three primary components:
- Controller: Decomposes a planning problem into sub-goals and classifies them as either "easy" or "hard."
- System-1 Planner: Addresses the easier sub-goals using fast, heuristic-based planning.
- System-2 Planner: Deals with harder sub-goals through more deliberate, search-based methods.
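The division of labor among the three components can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the names `SubGoal`, `solve_hybrid`, `straight_walk`, `fast_planner`, and `slow_planner` are hypothetical, the grid states are an assumed representation, and both planners are stand-in stubs (a real System-2 planner would run search).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = Tuple[int, int]  # (row, col); an assumed state representation
Planner = Callable[[State, State], List[State]]


@dataclass
class SubGoal:
    start: State
    goal: State
    hard: bool  # label assigned by a (hypothetical) controller


def straight_walk(start: State, goal: State) -> List[State]:
    """Stub planner: move along rows first, then columns (ignores obstacles)."""
    (r, c), (gr, gc) = start, goal
    path = [(r, c)]
    while r != gr:
        r += 1 if gr > r else -1
        path.append((r, c))
    while c != gc:
        c += 1 if gc > c else -1
        path.append((r, c))
    return path


def solve_hybrid(sub_goals: List[SubGoal], system1: Planner, system2: Planner) -> List[State]:
    """Route each sub-goal to the planner matching its label and stitch the segments."""
    plan: List[State] = []
    for sg in sub_goals:
        planner = system2 if sg.hard else system1
        segment = planner(sg.start, sg.goal)
        # Consecutive segments share a boundary state; keep it only once.
        plan.extend(segment if not plan else segment[1:])
    return plan


calls: List[str] = []


def fast_planner(start: State, goal: State) -> List[State]:
    calls.append("system1")
    return straight_walk(start, goal)


def slow_planner(start: State, goal: State) -> List[State]:
    calls.append("system2")  # placeholder: a real System-2 would search here
    return straight_walk(start, goal)


sub_goals = [
    SubGoal(start=(0, 0), goal=(0, 2), hard=False),
    SubGoal(start=(0, 2), goal=(2, 2), hard=True),
]
plan = solve_hybrid(sub_goals, fast_planner, slow_planner)
print(plan)
print(calls)
```

The point of the sketch is only the routing step: the controller's labels decide which planner handles each segment, and the segments are concatenated into one plan.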
Methodology
The distinguishing feature of System-1.x is its controllability, governed by a user-defined hybridization factor x. This factor sets the proportion of fast versus thorough planning, allowing the controller to allocate search effort according to the perceived difficulty of each sub-goal.
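One way to picture the role of x is as the fraction of sub-goals sent to System-2. The sketch below assumes that each sub-goal already has a numeric difficulty score and that the hardest fraction x is routed to System-2. The function `assign_planners`, the scores, and the rounding rule are illustrative assumptions, not details taken from the paper.

```python
import math
from typing import List


def assign_planners(difficulties: List[float], x: float) -> List[str]:
    """Label the hardest fraction x of sub-goals 'system2', the rest 'system1'."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    # Number of sub-goals given to System-2 (round half up; an assumed rule).
    n_hard = math.floor(x * len(difficulties) + 0.5)
    # Indices sorted from hardest to easiest.
    ranked = sorted(range(len(difficulties)), key=lambda i: difficulties[i], reverse=True)
    hard = set(ranked[:n_hard])
    return ["system2" if i in hard else "system1" for i in range(len(difficulties))]


scores = [0.1, 0.9, 0.4, 0.7]  # hypothetical difficulty scores
for x in (0.0, 0.5, 1.0):
    print(x, assign_planners(scores, x))
```

With x = 0 the sketch behaves as a pure System-1 planner and with x = 1 as a pure System-2 planner; intermediate values mix the two.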
Training
To facilitate the training of System-1.x, search traces from various classical planning problems are employed:
- System-1 Data: Simple plans produced heuristically.
- System-2 Data: Search trajectories generated by algorithms such as A∗.
- Controller Data: Derived by decomposing plans into sub-goals using a sliding window technique and classifying them according to their difficulty, defined by a heuristic function.
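The controller data described above could be produced along the following lines. The source states only that a sliding window and a heuristic function are used. Everything else here is an assumption made for illustration: the function names `hardest_window` and `wall_neighbours`, the wall layout, the window length, and the choice of "number of adjacent walls" as the hardness heuristic.

```python
from typing import Callable, List, Set, Tuple

State = Tuple[int, int]  # (row, col) in a grid maze; an assumed representation


def hardest_window(plan: List[State], hardness: Callable[[State], float], window: int) -> Tuple[int, int]:
    """Return [start, end) indices of the contiguous window with the largest total hardness."""
    if not 1 <= window <= len(plan):
        raise ValueError("window must be between 1 and len(plan)")
    scores = [hardness(s) for s in plan]
    current = best = sum(scores[:window])
    best_start = 0
    for start in range(1, len(plan) - window + 1):
        # Slide by one state: add the entering score, drop the leaving one.
        current += scores[start + window - 1] - scores[start - 1]
        if current > best:
            best_start, best = start, current
    return best_start, best_start + window


walls: Set[State] = {(1, 1), (1, 2)}  # hypothetical obstacle cells


def wall_neighbours(state: State) -> int:
    """Assumed hardness heuristic: count of walls among the four neighbours."""
    r, c = state
    return sum(n in walls for n in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)))


plan = [(0, 0), (0, 1), (0, 2), (0, 3), (1, 3), (2, 3)]
start, end = hardest_window(plan, wall_neighbours, window=2)
labels = ["hard" if start <= i < end else "easy" for i in range(len(plan))]
print((start, end), labels)
```

States inside the selected window would be labeled as the hard sub-goal, and the states before and after it as easy sub-goals.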
Evaluation
The paper evaluates System-1.x through experiments on two diverse planning tasks:
- Maze Navigation: Involves navigating a 5x5 maze with obstacles.
- Blocksworld: Requires reconfiguring blocks on a table to a predefined goal state.
Key Results and Analysis
Performance and Efficiency
System-1.x demonstrates superior performance across a range of budgets compared to pure System-1 and System-2 approaches. For instance, in the Maze Navigation task:
- System-1.x achieves an accuracy of 70.4% at approximately 13.6 states explored, surpassing the System-1 and System-2 planners, which obtain 48.7% and 37.2%, respectively, at a comparable number of states explored.
In the Blocksworld task, which tests out-of-distribution generalization to longer plan lengths:
- System-1.x maintains higher accuracy at a lower #States-Explored, showing significant improvements over System-2, especially when decomposition into sub-goals simplifies the planning process.
Controllability
A notable feature of System-1.x is its controllability, both at training and inference times:
- By adjusting the hybridization factor x, users can fine-tune the balance between speed and accuracy.
- During inference, the controller can be biased towards more System-2 planning if higher accuracy is required, effectively moving the System-1.x Planner towards a full System-2 Planner without retraining.
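Inference-time biasing can be illustrated with a simple routing rule. The sketch assumes the controller exposes, for each sub-goal, a probability that the sub-goal is hard, and that a user-supplied bias is added to the corresponding logit before thresholding. This mechanism, the function `route`, and the example probabilities are hypothetical and are not described in the source.

```python
import math
from typing import List


def route(p_hard: float, bias: float = 0.0, threshold: float = 0.5) -> str:
    """Pick a planner from the controller's 'hard' probability plus a user bias."""
    if not 0.0 < p_hard < 1.0:
        raise ValueError("p_hard must lie strictly between 0 and 1")
    logit = math.log(p_hard / (1.0 - p_hard))
    # A positive bias pushes decisions toward System-2; no retraining is involved.
    shifted = 1.0 / (1.0 + math.exp(-(logit + bias)))
    return "system2" if shifted >= threshold else "system1"


probs: List[float] = [0.2, 0.45, 0.7]  # hypothetical controller outputs
for bias in (0.0, 1.0, 5.0):
    print(bias, [route(p, bias) for p in probs])
```

As the bias grows, more sub-goals are routed to System-2, and with a sufficiently large bias every sub-goal is, which mirrors the shift towards a full System-2 Planner described above.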
Neuro-Symbolic Integration
The potential for combining neural and symbolic methods is also explored:
- Neuro-symbolic System-1.x, which uses A∗ as the System-2 component, outperforms pure symbolic planners like A∗ at matched #States-Explored. For example, at 11.6 states, System-1.x achieves 70.5% accuracy compared to A∗'s 31.0%.
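For readers unfamiliar with the symbolic baseline, the following is a minimal sketch of A* on a 5x5 grid maze that also reports how many states it expanded. Counting expanded states is an assumption about how a metric like #States-Explored could be measured. The function `astar`, the wall layout, and the Manhattan-distance heuristic are illustrative choices, not the paper's setup.

```python
import heapq
from typing import Dict, List, Optional, Set, Tuple

State = Tuple[int, int]  # (row, col)


def astar(start: State, goal: State, walls: Set[State], size: int = 5) -> Tuple[Optional[List[State]], int]:
    """A* with a Manhattan heuristic; returns (path or None, number of expanded states)."""

    def h(s: State) -> int:
        return abs(s[0] - goal[0]) + abs(s[1] - goal[1])

    frontier = [(h(start), 0, start)]  # entries are (f = g + h, g, state)
    parent: Dict[State, Optional[State]] = {start: None}
    cost: Dict[State, int] = {start: 0}
    closed: Set[State] = set()
    while frontier:
        _, g, state = heapq.heappop(frontier)
        if state in closed:
            continue  # stale queue entry
        closed.add(state)
        if state == goal:
            path: List[State] = []
            node: Optional[State] = state
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], len(closed)
        r, c = state
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            in_grid = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if not in_grid or nxt in walls:
                continue
            if g + 1 < cost.get(nxt, float("inf")):
                cost[nxt] = g + 1
                parent[nxt] = state
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt))
    return None, len(closed)


walls = {(1, 0), (1, 1), (1, 2), (1, 3)}  # hypothetical maze: a wall with one gap
path, explored = astar((0, 0), (2, 0), walls)
print(len(path) - 1, "moves,", explored, "states expanded")
```

In a hybrid setup, a search routine of this kind would be invoked only on the sub-goals labeled hard, which is how the overall number of explored states can stay below that of running search on the whole problem.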
Implications and Future Directions
This research has both practical and theoretical implications:
- Practically: System-1.x offers a robust, flexible planning approach suitable for diverse and complex tasks where resource constraints vary.
- Theoretically: It underscores the potential for hybrid models that leverage the best of heuristic-based and search-based planning, aligning with concepts from dual-process theories in cognitive science.
Future developments could explore:
- Scalability: Extending System-1.x to handle larger-scale and more dynamic planning environments.
- Adaptation to Uncertainty: Enhancing the controller to better manage partially observable and non-deterministic environments.
- Further Integration: Seamlessly blending neural and symbolic methods to enhance the adaptability and generality of the planner.
In summary, the System-1.x Planner advances the application of LLMs to planning tasks by combining efficiency and accuracy in a hybrid, controllable approach. This sets a promising precedent for the development of more adaptive planning systems.