Introduction to Directional Stimulus Prompting
LLMs have revolutionized natural language processing, demonstrating capabilities absent from earlier, smaller language models. However, directly optimizing LLMs for specific tasks remains a daunting challenge, especially since these models are often available only through black-box APIs. Their sheer scale also raises cost and accessibility barriers. As an alternative to modifying the models themselves, research efforts have turned toward optimizing the prompts used to interact with them.
A Novel Approach with Directional Stimulus
To refine the guidance provided to LLMs, a novel framework, Directional Stimulus Prompting (DSP), is introduced. Unlike prior work that relies on task-specific instructions or external knowledge augmentation, DSP integrates a "directional stimulus" (hints such as keywords) into the prompt. The stimulus offers instance-specific cues that steer the LLM toward a desired outcome, providing a lightweight way to generate outputs that align more closely with specific references or goals.
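To make the idea concrete, here is a minimal sketch of how such a stimulus might be spliced into a summarization prompt. The prompt wording, the `build_prompt` helper, and the example keywords are illustrative assumptions, not the paper's exact template:

```python
def build_prompt(article: str, keywords: list[str]) -> str:
    """Splice instance-specific hint keywords into a summarization prompt."""
    hint = "; ".join(keywords)
    return (
        f"Article: {article}\n"
        f"Keywords: {hint}\n"
        "Summarize the article above in two to three sentences, "
        "making sure the summary covers the listed keywords."
    )

print(build_prompt(
    "Bob Barker returned to host The Price Is Right for one segment ...",
    ["Bob Barker", "The Price Is Right", "return"],
))
```

Because the keywords are produced per input instance, the same template yields instance-specific guidance without any change to the LLM itself.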
Policy Model Training and Reinforcement Learning
To generate this directional stimulus, a smaller, tunable policy model such as T5 is used, sidestepping the complexity and cost of modifying the LLM directly. The policy model is first trained with supervised fine-tuning on labeled data. It is then optimized with reinforcement learning to discover stimuli that earn high rewards, where the reward is defined by LLM performance metrics or human preference.
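The supervised stage can be pictured as ordinary sequence-to-sequence fine-tuning, where the policy model learns to map an input (e.g., an article) to a stimulus (e.g., keywords drawn from the reference summary). The sketch below assumes Hugging Face `transformers` and a single training pair; the task prefix and hyperparameters are illustrative:

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# One illustrative (article, stimulus) pair; in practice the targets are
# keywords extracted from the labeled reference summaries.
article = "Bob Barker returned to host The Price Is Right for one segment ..."
stimulus = "Bob Barker; The Price Is Right; return"

inputs = tokenizer(
    "Generate keywords for the article: " + article,
    return_tensors="pt", truncation=True, max_length=512,
)
labels = tokenizer(stimulus, return_tensors="pt").input_ids

# Standard sequence-to-sequence cross-entropy loss on the stimulus tokens.
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```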
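For the reinforcement learning stage, the scalar reward for a generated stimulus can be computed by prompting the black-box LLM with that stimulus and scoring the output against a reference. Below is a minimal sketch assuming a summarization task scored with ROUGE; `call_llm` is a hypothetical API wrapper, and the exact reward in the paper may weight metrics differently or use human preference instead:

```python
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)

def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around a black-box LLM API such as ChatGPT."""
    raise NotImplementedError

def compute_reward(article: str, stimulus: str, reference: str) -> float:
    # Prompt the frozen LLM with the article plus the policy-generated hints.
    prompt = (
        f"Article: {article}\nKeywords: {stimulus}\n"
        "Summarize the article above, covering the listed keywords."
    )
    summary = call_llm(prompt)
    # Score the LLM's output against the reference; the mean of two ROUGE
    # F1 scores serves as the scalar reward for the policy update.
    scores = scorer.score(reference, summary)
    return 0.5 * (scores["rouge1"].fmeasure + scores["rougeL"].fmeasure)
```

This reward would then drive a PPO-style policy-gradient update of the T5 policy model, making stimuli that earn higher rewards more likely.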
Empirical Assessment of the Framework
The DSP framework's effectiveness was evaluated on tasks including summarization, dialogue response generation, and chain-of-thought reasoning. The results were noteworthy: supplying keywords as directional stimuli improved ChatGPT's performance, and on dialogue response generation performance improved by over 40% on specific metrics. The framework proved adept at guiding LLMs toward desired outcomes, demonstrating its potential to generalize across LLMs and diverse tasks.