The paper "Prompt Space Optimizing Few-shot Reasoning Success with LLMs" addresses the challenge of selecting and engineering prompts that enhance the reasoning abilities of large language models (LLMs). While prompt engineering techniques such as Chain of Thought (CoT) prompting and Zero-shot CoT have been extensively explored, they often lack a mathematically grounded method for determining optimal prompts. To tackle this, the authors propose a new approach called Prompt Space, which constructs a mathematical framework for prompt selection, aiming to improve few-shot reasoning performance.
Key Contributions:
- Mathematical Framework: The paper introduces a novel Prompt Space methodology, which utilizes text embeddings and matrix decomposition (using SVD and PCA) to identify a basis set of prompt vectors. This basis set effectively spans the space of potential prompts for reasoning tasks.
- Empirical Evaluation:
  - The Prompt Space method significantly outperforms baseline methods, including CoT, Zero-shot-CoT, and Auto-CoT, across ten public reasoning benchmarks, demonstrating a consistent improvement in task success rates.
  - The approach effectively identifies the optimal number of basis questions for each reasoning task, showing significant improvements without relying on CoT or "Let's think step by step" prompts.
- Robustness Across Tasks: The proposed method is evaluated over a variety of reasoning tasks such as arithmetic reasoning (e.g., AddSub, MultiArith), commonsense reasoning (e.g., CommonsenseQA), and symbolic reasoning (e.g., Last Letter Concatenation). It consistently shows improved performance over traditional question-answer pair prompt designs.
- Impact of Embedding Models: The authors examine how different embedding models affect Prompt Space, finding that an appropriate embedding dimensionality is crucial to its performance gains. Several T5 and E5 model variants are tested to elucidate this aspect.
- Significance of Basis Questions: Prompt Space's exploration of the number of basis questions shows that carefully selecting basis questions pertinent to the task is essential to enhancing the reasoning performance of LLMs.
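The basis-question idea above can be sketched in a few lines: embed a pool of candidate questions, take the SVD of the embedding matrix, and for each leading singular direction pick the question whose embedding is most aligned with it. This is a minimal illustration, not the paper's exact algorithm; the random matrix stands in for real sentence embeddings, and the selection rule (cosine alignment with singular directions) is an assumption for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for text embeddings of a question pool (hypothetical data):
# 100 candidate questions, each embedded as a 64-dimensional vector.
E = rng.normal(size=(100, 64))

def select_basis_questions(E, k):
    """For each of the top-k right singular directions of the embedding
    matrix, pick the (previously unpicked) question whose embedding is
    most aligned with that direction."""
    # SVD: E = U @ diag(S) @ Vt; rows of Vt are the principal directions.
    U, S, Vt = np.linalg.svd(E, full_matrices=False)
    # Normalize embeddings so the dot product is cosine similarity.
    En = E / np.linalg.norm(E, axis=1, keepdims=True)
    chosen = []
    for i in range(k):
        sims = np.abs(En @ Vt[i])   # |cosine| to the i-th direction
        sims[chosen] = -np.inf      # exclude already-chosen questions
        chosen.append(int(np.argmax(sims)))
    return chosen

basis = select_basis_questions(E, k=5)
print(basis)  # indices of 5 representative "basis" questions
```

The number of directions `k` corresponds to the number of basis questions, which the paper reports tuning per task.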
Technical Details:
- Matrix Decomposition: SVD decomposes the question embeddings into a basis vector space, from which the basis questions most representative of the task are selected.
- Prompt Creation: The selected basis questions are formatted as few-shot exemplars and prepended to the test question, guiding the LLM through the reasoning steps needed to complete the task.
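The prompt-creation step above amounts to simple string assembly: concatenate the basis question–answer pairs as exemplars and append the test question. A minimal sketch, with illustrative exemplar text and a hypothetical `build_prompt` helper (the paper does not prescribe this exact format):

```python
def build_prompt(exemplars, test_question):
    """Format basis question-answer pairs as few-shot exemplars,
    then append the test question with an open answer slot."""
    parts = [f"Q: {q}\nA: {a}" for q, a in exemplars]
    parts.append(f"Q: {test_question}\nA:")
    return "\n\n".join(parts)

# Illustrative basis questions with worked answers (made up for the sketch).
exemplars = [
    ("There are 3 apples and 2 pears. How many fruits are there?",
     "3 + 2 = 5. The answer is 5."),
    ("Tom has 10 coins and spends 4. How many remain?",
     "10 - 4 = 6. The answer is 6."),
]
prompt = build_prompt(exemplars, "A shelf holds 7 books; 3 are removed. How many are left?")
print(prompt)
```

The resulting string is sent to the LLM as-is; the trailing "A:" invites the model to continue with its own reasoning and answer.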
The paper's evaluation indicates a reliable and robust methodology for selecting effective prompts in LLMs, advancing the frontier of prompt engineering towards a mathematically grounded and empirically validated approach.