
FLAP: Flow-Adhering Planning with Constrained Decoding in LLMs (2403.05766v3)

Published 9 Mar 2024 in cs.CL

Abstract: Planning is a crucial task for agents in task oriented dialogs (TODs). Human agents typically resolve user issues by following predefined workflows, decomposing workflow steps into actionable items, and performing actions by executing APIs in order; all of which require reasoning and planning. With the recent advances in LLMs, there have been increasing attempts to use them for task planning and API usage. However, the faithfulness of the plans to predefined workflows and API dependencies, is not guaranteed with LLMs. Moreover, workflows in real life are often custom-defined and prone to changes; hence, adaptation is desirable. To study this, we propose the problem of faithful planning in TODs that needs to resolve user intents by following predefined flows and preserving API dependencies. To solve this problem, we propose FLAP, a Flow-Adhering Planning algorithm based on constrained decoding with lookahead heuristic for LLMs. Our algorithm alleviates the need for finetuning LLMs using domain specific (plan/dependency) data, enables quick adaptation to predefined flows, and outperforms other decoding and prompting-based baselines. Further, our algorithm empowers smaller LLMs (7B) to perform at par larger LLMs (30B-40B).


Summary

  • The paper introduces FLAP, which leverages constrained decoding with lookahead heuristics to enforce workflow and API dependency adherence.
  • FLAP significantly reduces planning errors, enabling smaller LLMs to achieve performance comparable to much larger models in task-oriented dialogs.
  • FLAP uses dynamic dependency graphs to adapt plans in real time, ensuring reliable execution of customized workflows in practical applications.

Understanding FLAP: Enhancing Task-Oriented Dialogs with Flow Adhering Planning in LLMs

Planning in Task-Oriented Dialogs (TODs)

In task-oriented dialogs, agents must perform a sequence of actions, typically API calls, to fulfill user requests. This requires identifying not only which actions to perform but also the order in which they must be executed, which can be challenging. A common approach is to prompt LLMs directly to generate a plan that satisfies the request. However, a significant limitation of using LLMs this way is their tendency to deviate from the predefined workflows and API dependencies dictated by the task domain.
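To make the faithfulness requirement concrete, the check below sketches what it means for a plan to preserve API dependencies. The API names and their dependency graph are hypothetical illustrations, not from the paper:

```python
# Hypothetical flight-booking domain: each API maps to the set of APIs
# that must have been executed before it.
API_DEPS = {
    "search_flights": set(),
    "check_seat_availability": {"search_flights"},
    "book_flight": {"check_seat_availability"},
    "send_confirmation": {"book_flight"},
}

def plan_is_faithful(plan):
    """Return True iff every API in the plan is preceded by all of its dependencies."""
    executed = set()
    for api in plan:
        if not API_DEPS[api] <= executed:  # some prerequisite has not run yet
            return False
        executed.add(api)
    return True

print(plan_is_faithful(["search_flights", "check_seat_availability", "book_flight"]))  # True
print(plan_is_faithful(["book_flight", "search_flights"]))  # False: booking before searching
```

An unconstrained LLM can emit a fluent plan that fails exactly this kind of check; FLAP's goal is to guarantee it passes by construction rather than by post-hoc validation.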

The FLAP Algorithm

The paper introduces FLAP (Flow Adhering Planning), a novel algorithm that enhances LLMs' ability to generate plans for TODs by adhering to predefined workflows and API dependencies. Unlike traditional approaches that might require retraining or fine-tuning LLMs with domain-specific data, FLAP operates using constrained decoding based on lookahead heuristics. This is particularly beneficial in real-world scenarios where workflows and API dependencies are often customized and subject to change.

FLAP's constrained decoding maintains dynamic dependency graphs for both APIs and workflow steps, ensuring that plans generated by LLMs preserve these dependencies. Using beam search with lookahead, it scores potential next actions by their alignment with the permitted actions inferred from the dependency graphs. FLAP also incorporates several scoring components into its heuristic function, such as the alignment of generated thoughts with permitted workflow steps and APIs, thereby keeping the generated plan closely tied to the task context.
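The core idea above can be sketched as follows. This is an illustrative simplification, not the paper's implementation: the mock log-probabilities stand in for the LLM's beam scores, and the hard penalty for disallowed actions stands in for FLAP's alignment-based heuristic components:

```python
import math

def permitted_next(api_deps, executed):
    """APIs whose dependencies are all satisfied and that have not yet run."""
    return {api for api, deps in api_deps.items()
            if api not in executed and deps <= executed}

def score_candidate(lm_logprob, candidate_api, allowed):
    """Combine the LM's own score with a constraint-alignment term:
    disallowed actions are pruned outright (-inf)."""
    alignment = 0.0 if candidate_api in allowed else -math.inf
    return lm_logprob + alignment

# Hypothetical dependency graph: "b" requires "a", "c" requires "b".
api_deps = {"a": set(), "b": {"a"}, "c": {"b"}}
executed = {"a"}
allowed = permitted_next(api_deps, executed)   # only "b" is permitted now

# Mock beam candidates as (api, LM log-prob). The raw LM prefers "c",
# but "c" violates the dependency graph and is pruned.
candidates = [("b", -0.5), ("c", -0.1)]
best = max(candidates, key=lambda c: score_candidate(c[1], c[0], allowed))
print(best[0])  # "b"
```

In FLAP itself the alignment term is a soft heuristic computed from lookahead rollouts and thought/step similarity rather than a hard mask, but the selection principle, reranking beam candidates by LM score plus dependency-graph alignment, is the same.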

Performance and Evaluation

FLAP was evaluated on a novel dataset spanning multiple domains, intents, and associated workflows, showcasing its adaptability across scenarios. The results show that LLMs equipped with FLAP significantly outperform decoding- and prompting-based baselines in faithful plan generation, notably reducing errors related to API and workflow-step dependencies. Notably, applying FLAP to smaller LLMs (e.g., 7B parameters) yielded performance comparable to much larger models (30B-40B parameters), demonstrating that FLAP improves planning without requiring larger models.

Implications and Forward Look

The introduction of FLAP opens up several avenues for future work, particularly in dynamic planning scenarios where plans may need to be adjusted in real-time based on conversational context or external events. Moreover, FLAP's ability to effectively utilize smaller LLMs for complex planning tasks hints at broader applicability in resource-constrained environments.

Looking ahead, further refinement of FLAP's constrained decoding strategies could provide even finer control over the planning process, enabling more nuanced adherence to complex workflows and dependencies. Additionally, integrating external knowledge sources or real-time API call feedback into FLAP's planning process could further enhance its effectiveness and reliability in practical applications.

In summary, FLAP represents a significant advance in leveraging LLMs for task-oriented dialogs, emphasizing the importance of structure and constraint adherence in plan generation tasks. Its development marks a step forward in the journey toward more effective, efficient, and adaptable automated dialog agents.
