- The paper introduces a method that combines LLM translation of natural language feedback with PDDL symbolic planning, using a genetic algorithm to refine specifications so that plans better meet human-specified objectives.
- It employs an LSTM-based model to evaluate plan adherence and iteratively optimize candidate solutions through evolutionary operations.
- Experimental evaluation in naval disaster response scenarios shows significant improvement in alignment of generated plans with human intent.
Aligning LLM+PDDL Symbolic Plans with Human Objective Specifications through Evolutionary Algorithm Guidance
The paper presents an approach that combines the strengths of large language models (LLMs) and Planning Domain Definition Language (PDDL) symbolic planners to generate plans that align more closely with human-specified objectives. Introducing an evolutionary algorithm into this process compensates for inaccuracies in the translation from natural language to PDDL, thereby improving plan fidelity to human intent.
Introduction
Automated symbolic planners, particularly those utilizing PDDL, have long facilitated optimal plan generation from formally specified domains and goals. However, translating human intent into these machine-processable formats remains a challenge. Recent advances using LLMs to bridge this gap have shown promise but fall short when initial translations contain errors that distort user intent. The proposed framework leverages LLM capabilities to convert natural language feedback into symbolic specifications, augmented by an evolutionary algorithm that iteratively refines plan adherence to specified human objectives.
Technical Approach
User Interaction and LLM Utilization
The system begins with an LLM translating user-provided natural language feedback into initial PDDL constraints. This first translation often yields imprecise symbolic specifications that must be refined downstream. Users submit feedback through an interface, and the LLM interprets it as symbolic expressions grounded in the task domain. Tabulated example translations illustrate these transformations and highlight how direct translation can fail to capture nuanced human preferences.
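A minimal sketch of this translation step is given below, assuming Python. The prompt template, the constraint syntax in the example, and the `call_llm` placeholder are illustrative assumptions, not the authors' actual prompts or tooling.

```python
# Sketch of the NL-feedback -> PDDL-constraint translation step.
# `call_llm` stands in for whatever LLM client is used; the prompt wording
# and the example constraint are hypothetical.

PROMPT_TEMPLATE = """You are translating operator feedback into PDDL goal constraints.
Domain predicates: {predicates}
Feedback: "{feedback}"
Return a single PDDL constraint expression."""


def call_llm(prompt: str) -> str:
    """Placeholder for an LLM chat/completion call; wire in a real client here."""
    raise NotImplementedError


def translate_feedback(feedback: str, predicates: list[str]) -> str:
    """Build the prompt and return the LLM's constraint expression."""
    prompt = PROMPT_TEMPLATE.format(predicates=", ".join(predicates), feedback=feedback)
    return call_llm(prompt).strip()


# Hypothetical usage:
# translate_feedback("Clear the harbor debris before any ship moves",
#                    ["(cleared ?d - debris)", "(at ?a - asset ?l - location)"])
# might return something like: (forall (?d - debris) (cleared ?d))
```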
Symbolic Planner and Genetic Algorithm Integration
A symbolic planner generates an initial plan from the translated specifications. Because the LLM's translations can introduce specifications that do not fully align with user intent or domain constraints, a genetic algorithm systematically explores variations of these specifications to improve plan adherence. The algorithm maintains a population of candidate specifications and applies crossover and mutation operations, guided by adherence scores produced by an LSTM-based model.
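A minimal sketch of one such refinement loop, assuming Python, is shown below. The list-of-constraints encoding, truncation selection, operator details, and the toy fitness placeholder are illustrative assumptions; in the paper the fitness comes from the Specification Adherence Model described in the next subsection.

```python
# Sketch of a genetic-algorithm loop over candidate PDDL specifications.
import random


def adherence_score(spec: list[str], feedback: str) -> float:
    """Toy placeholder fitness: fraction of constraints that echo a feedback word.
    The paper instead scores the plan produced from `spec` with an LSTM-based
    Specification Adherence Model."""
    words = set(feedback.lower().split())
    hits = sum(any(w in c.lower() for w in words) for c in spec)
    return hits / max(len(spec), 1)


def crossover(a: list[str], b: list[str]) -> list[str]:
    """Single-point crossover of two constraint lists."""
    cut = random.randint(0, min(len(a), len(b)))
    return a[:cut] + b[cut:]


def mutate(spec: list[str], pool: list[str], rate: float = 0.1) -> list[str]:
    """Randomly swap constraints for alternatives drawn from a candidate pool."""
    return [random.choice(pool) if random.random() < rate else c for c in spec]


def evolve(seed: list[str], pool: list[str], feedback: str,
           pop_size: int = 20, generations: int = 30) -> list[str]:
    """Evolve the seed specification toward higher adherence to the feedback."""
    population = [mutate(seed, pool, rate=0.5) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda s: adherence_score(s, feedback), reverse=True)
        parents = ranked[: pop_size // 2]  # truncation selection
        children = [mutate(crossover(*random.sample(parents, 2)), pool)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=lambda s: adherence_score(s, feedback))
```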
Specification Adherence and Fitness Evaluation
The fitness of each evolved candidate is measured by a Specification Adherence Model, which quantifies how closely the resulting plan follows the original human intent. The model uses an LSTM-based neural architecture to evaluate the alignment of symbolic plans with feedback statements. Fitness scores then drive the selection and evolution of candidate specifications, improving plans over successive generations.
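A minimal sketch of such an adherence scorer, assuming PyTorch and a shared token vocabulary for plans and feedback, might look as follows; the layer sizes and the concatenate-then-score head are illustrative choices rather than the paper's exact architecture.

```python
# Sketch of an LSTM-based adherence scorer (illustrative architecture).
import torch
import torch.nn as nn


class AdherenceModel(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.plan_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.feedback_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, plan_tokens: torch.Tensor, feedback_tokens: torch.Tensor) -> torch.Tensor:
        # Encode each token sequence and keep the final hidden state as its summary.
        _, (plan_h, _) = self.plan_lstm(self.embed(plan_tokens))
        _, (fb_h, _) = self.feedback_lstm(self.embed(feedback_tokens))
        pair = torch.cat([plan_h[-1], fb_h[-1]], dim=-1)
        # Sigmoid maps the score to [0, 1], read as an adherence estimate.
        return torch.sigmoid(self.head(pair)).squeeze(-1)


# Hypothetical usage: batched integer token ids for a plan and a feedback statement.
# model = AdherenceModel(vocab_size=5000)
# score = model(plan_tokens, feedback_tokens)  # tensor of shape (batch,)
```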
Evaluation and Results
Scenario Setup
The framework was tested in a naval disaster response scenario, simulating post-disaster environments requiring debris clearance. Domain-specific constraints such as asset movements and debris management illustrate the complexities of translating user objectives into operational plans.
Experimental results demonstrate that the system generates plans more closely aligned with user objectives than those produced by LLM translation alone. The genetic algorithm significantly improved adherence rates for most constraint archetypes, although challenges remain in scenarios that demand long plan horizons or objectives composed of multiple disjoint actions.
Conclusion
The paper underscores the potential of integrating genetic algorithms with neurosymbolic frameworks to enhance plan adherence in dynamic environments. While results are promising, further exploration of adherence model architectures and the optimization of computational resource use remains essential. Future directions include refining the training dataset and exploring different neural architectures to improve robustness and scalability in complex planning domains.