- The paper introduces a method that combines LLM translation of natural language feedback with PDDL symbolic planning, using a genetic algorithm to refine specifications so that plans better meet human-specified objectives.
- It employs an LSTM-based model to evaluate plan adherence and iteratively optimize candidate solutions through evolutionary operations.
- Experimental evaluation in naval disaster response scenarios shows significant improvement in alignment of generated plans with human intent.
Aligning LLM+PDDL Symbolic Plans with Human Objective Specifications through Evolutionary Algorithm Guidance
The paper presents an approach that combines the strengths of large language models (LLMs) and Planning Domain Definition Language (PDDL) symbolic planners to generate plans that align more closely with human-specified objectives. Introducing an evolutionary algorithm into this process compensates for inaccuracies in the translation from natural language to PDDL, thereby improving plan fidelity to human intent.
Introduction
Automated symbolic planners, particularly those utilizing PDDL, have long facilitated optimal plan generation from formally specified domains and goals. However, translating human intent into these machine-processable formats remains a challenge. Recent advances using LLMs to bridge this gap have shown promise but fall short when initial translations contain errors that distort user intent. The proposed framework leverages LLM capabilities to convert natural language feedback into symbolic specifications, augmented by an evolutionary algorithm that iteratively refines plan adherence to specified human objectives.
Technical Approach
User Interaction and LLM Utilization
The system begins with an LLM translating user-provided natural language feedback into initial PDDL constraints. This first translation often yields imprecise symbolic specifications that must be refined downstream. Users submit feedback through an interface, and the LLM interprets it as symbolic expressions grounded in the task domain. Tabulated example translations illustrate these transformations and highlight how direct translation can fail to capture nuanced human preferences.
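A minimal sketch of this translation step is given below, assuming Python. The prompt template, the constraint syntax in the example, and the `call_llm` placeholder are illustrative assumptions, not the authors' actual prompts or tooling.

```python
# Sketch of the NL-feedback -> PDDL-constraint translation step.
# `call_llm` stands in for whatever LLM client is used; the prompt wording
# and the example constraint are hypothetical.

PROMPT_TEMPLATE = """You are translating operator feedback into PDDL goal constraints.
Domain predicates: {predicates}
Feedback: "{feedback}"
Return a single PDDL constraint expression."""


def call_llm(prompt: str) -> str:
    """Placeholder for an LLM chat/completion call; wire in a real client here."""
    raise NotImplementedError


def translate_feedback(feedback: str, predicates: list[str]) -> str:
    """Build the prompt and return the LLM's constraint expression."""
    prompt = PROMPT_TEMPLATE.format(predicates=", ".join(predicates), feedback=feedback)
    return call_llm(prompt).strip()


# Hypothetical usage:
# translate_feedback("Clear the harbor debris before any ship moves",
#                    ["(cleared ?d - debris)", "(at ?a - asset ?l - location)"])
# might return something like: (forall (?d - debris) (cleared ?d))
```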
Symbolic Planner and Genetic Algorithm Integration
A symbolic planner generates an initial plan from the translated specifications. Because the LLM's translations can introduce specifications that do not fully align with user intent or domain constraints, a genetic algorithm systematically explores variations of these specifications to improve plan adherence. The algorithm maintains a population of candidate specifications and applies crossover and mutation operations, guided by adherence scores produced by an LSTM-based model.
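A minimal sketch of one such refinement loop, assuming Python, is shown below. The list-of-constraints encoding, truncation selection, operator details, and the toy fitness placeholder are illustrative assumptions; in the paper the fitness comes from the Specification Adherence Model described in the next subsection.

```python
# Sketch of a genetic-algorithm loop over candidate PDDL specifications.
import random


def adherence_score(spec: list[str], feedback: str) -> float:
    """Toy placeholder fitness: fraction of constraints that echo a feedback word.
    The paper instead scores the plan produced from `spec` with an LSTM-based
    Specification Adherence Model."""
    words = set(feedback.lower().split())
    hits = sum(any(w in c.lower() for w in words) for c in spec)
    return hits / max(len(spec), 1)


def crossover(a: list[str], b: list[str]) -> list[str]:
    """Single-point crossover of two constraint lists."""
    cut = random.randint(0, min(len(a), len(b)))
    return a[:cut] + b[cut:]


def mutate(spec: list[str], pool: list[str], rate: float = 0.1) -> list[str]:
    """Randomly swap constraints for alternatives drawn from a candidate pool."""
    return [random.choice(pool) if random.random() < rate else c for c in spec]


def evolve(seed: list[str], pool: list[str], feedback: str,
           pop_size: int = 20, generations: int = 30) -> list[str]:
    """Evolve the seed specification toward higher adherence to the feedback."""
    population = [mutate(seed, pool, rate=0.5) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda s: adherence_score(s, feedback), reverse=True)
        parents = ranked[: pop_size // 2]  # truncation selection
        children = [mutate(crossover(*random.sample(parents, 2)), pool)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=lambda s: adherence_score(s, feedback))
```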
Specification Adherence and Fitness Evaluation
The fitness of each evolved candidate is measured by a Specification Adherence Model, which quantifies how closely the resulting plan follows the original human intent. The model uses an LSTM-based neural architecture to evaluate the alignment of symbolic plans with feedback statements. Fitness scores then drive the selection and evolution of candidate specifications, improving plans over successive generations.
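A minimal sketch of such an adherence scorer, assuming PyTorch and a shared token vocabulary for plans and feedback, might look as follows; the layer sizes and the concatenate-then-score head are illustrative choices rather than the paper's exact architecture.

```python
# Sketch of an LSTM-based adherence scorer (illustrative architecture).
import torch
import torch.nn as nn


class AdherenceModel(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.plan_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.feedback_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, plan_tokens: torch.Tensor, feedback_tokens: torch.Tensor) -> torch.Tensor:
        # Encode each token sequence and keep the final hidden state as its summary.
        _, (plan_h, _) = self.plan_lstm(self.embed(plan_tokens))
        _, (fb_h, _) = self.feedback_lstm(self.embed(feedback_tokens))
        pair = torch.cat([plan_h[-1], fb_h[-1]], dim=-1)
        # Sigmoid maps the score to [0, 1], read as an adherence estimate.
        return torch.sigmoid(self.head(pair)).squeeze(-1)


# Hypothetical usage: batched integer token ids for a plan and a feedback statement.
# model = AdherenceModel(vocab_size=5000)
# score = model(plan_tokens, feedback_tokens)  # tensor of shape (batch,)
```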
Evaluation and Results
Scenario Setup
The framework was tested in a naval disaster response scenario, simulating post-disaster environments requiring debris clearance. Domain-specific constraints such as asset movements and debris management illustrate the complexities of translating user objectives into operational plans.
Experimental results demonstrate that the system generates plans more closely aligned with user objectives than those produced by LLM translation alone. The genetic algorithm significantly improved adherence rates for most constraint archetypes, although challenges remain in scenarios that demand long plan horizons or objectives composed of multiple disjoint actions.
Conclusion
The paper underscores the potential of integrating genetic algorithms with neurosymbolic frameworks to enhance plan adherence in dynamic environments. While results are promising, further exploration of adherence model architectures and the optimization of computational resource use remains essential. Future directions include refining the training dataset and exploring different neural architectures to improve robustness and scalability in complex planning domains.