Repository-Level Prompt Generation for LLMs of Code
The paper presents a framework for improving the performance of LLMs of code by generating effective prompts from repository-level information. It proposes a system called the Repo-Level Prompt Generator (RLPG), which creates example-specific prompts by harnessing context from the entire code repository. Such context can include structural elements and relevant details from files beyond the one being completed, such as import files and parent class files.
The framework does not require access to the LLM's internal weights, which makes it applicable in scenarios where only black-box access to the model is available. This is particularly useful because many state-of-the-art LLMs, such as OpenAI's Codex, expose only an API for generating outputs and do not release their weights.
The authors conducted experiments on single-line code auto-completion using repositories obtained from the Google Code archive. These experiments demonstrated that the RLPG framework yields significant improvements over the baseline performance of Codex. An oracle experiment showed a 36% relative improvement in successful code completions compared to using Codex alone, and using the trained prompt proposal classifier, the framework achieved up to a 17% relative improvement over Codex and other baseline methods.
Methodology
- Repo-Level Prompt Proposals: The RLPG framework uses a set of prompt proposals designed to capture contextual information from across a repository. Each proposal is a combination of:
  - A Prompt Source: the file to draw context from, such as the current file, parent class files, import files, sibling files, or files with similar names.
  - A Prompt Context Type: what to extract from that source, such as identifiers, method names and bodies, string literals, or field declarations.
The framework incorporates domain-specific knowledge by drawing on these structured prompt proposals, allowing diverse prompts to be tailored to each example; a sketch of how such proposals might be enumerated appears after this list.
- Prompt Proposal Classifier (PPC): RLPG includes a learned model that predicts which prompt proposal is most likely to yield a successful completion for a given code hole. Two variants were explored: RLPG-H, which uses only a representation of the hole context, and RLPG-R, which additionally models the similarity between the hole context and the proposal context with a multi-headed attention mechanism. A minimal classifier sketch is given below.
- Prompt Composer: This component combines the selected prompt proposal context with the default context that Codex would otherwise use, dynamically adjusting how much of each is included so the prompt respects the model's context-length limit (see the composer sketch below).
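To make the proposal mechanism concrete, the following is a minimal sketch of how prompt proposals could be enumerated as (prompt source, context type) pairs and applied to repository files. The function names, the specific source and context-type lists, and the regex-based extraction are illustrative assumptions rather than the authors' implementation; a real system would use a proper parser for the repository's language.

```python
# Hypothetical sketch: enumerate prompt proposals and extract context from repo files.
import itertools
import re
from pathlib import Path

PROMPT_SOURCES = ["current_file", "parent_class_file", "import_file",
                  "sibling_file", "similar_name_file"]
CONTEXT_TYPES = ["identifiers", "method_names_and_bodies",
                 "string_literals", "field_declarations"]

def enumerate_proposals():
    """Each prompt proposal is one (prompt source, context type) combination."""
    return list(itertools.product(PROMPT_SOURCES, CONTEXT_TYPES))

def extract_context(file_path: Path, context_type: str) -> str:
    """Toy extraction; a real system would parse the file properly."""
    text = file_path.read_text(errors="ignore")
    if context_type == "identifiers":
        return " ".join(sorted(set(re.findall(r"[A-Za-z_]\w+", text)))[:200])
    if context_type == "string_literals":
        return " ".join(re.findall(r'"[^"]*"', text)[:50])
    # Fall back to raw text for context types this toy sketch does not parse.
    return text

if __name__ == "__main__":
    for source, ctype in enumerate_proposals()[:5]:
        print(f"proposal: take {ctype} from {source}")
```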
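The prompt proposal classifier can be pictured as a model that maps a representation of the hole context to a success probability for each proposal, in the spirit of RLPG-H. The sketch below assumes a fixed-size embedding of the hole window from some pretrained code encoder; the layer sizes, the number of proposals, and the sigmoid multi-label output are illustrative choices, not the paper's exact architecture.

```python
# Sketch of an RLPG-H-style prompt proposal classifier (illustrative only).
import torch
import torch.nn as nn

NUM_PROPOSALS = 63  # assumed constant; the real count depends on the valid (source, type) pairs

class PromptProposalClassifier(nn.Module):
    def __init__(self, hole_dim: int = 768, hidden_dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(hole_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, NUM_PROPOSALS),
        )

    def forward(self, hole_repr: torch.Tensor) -> torch.Tensor:
        # One independent success probability per proposal, since
        # more than one proposal can lead to a successful completion.
        return torch.sigmoid(self.mlp(hole_repr))

if __name__ == "__main__":
    clf = PromptProposalClassifier()
    hole_repr = torch.randn(1, 768)   # stand-in for an encoded hole window
    probs = clf(hole_repr)
    print(f"most promising proposal index: {int(probs.argmax(dim=-1))}")
```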
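Finally, a rough sketch of prompt composition under a context-length budget. The whitespace-based token counting, the 50/50 budget split, and the decision to prepend the proposal context to the tail of the default context are assumptions made for illustration; the actual composer's allocation rules may differ.

```python
# Hypothetical prompt composer: fit proposal context plus default context into a token budget.
MAX_CONTEXT_TOKENS = 4096

def count_tokens(text: str) -> int:
    # Placeholder: a real composer would use the LLM's own tokenizer.
    return len(text.split())

def compose_prompt(proposal_context: str, default_context: str,
                   proposal_fraction: float = 0.5) -> str:
    """Prepend proposal context to the default context, truncating each part
    so the combined prompt fits within the context-length limit."""
    proposal_budget = int(MAX_CONTEXT_TOKENS * proposal_fraction)
    default_budget = MAX_CONTEXT_TOKENS - proposal_budget

    proposal_tokens = proposal_context.split()[:proposal_budget]
    # Keep the *end* of the default context, i.e. the code right before the hole.
    default_tokens = default_context.split()[-default_budget:]

    return " ".join(proposal_tokens) + "\n" + " ".join(default_tokens)

if __name__ == "__main__":
    prompt = compose_prompt("class Foo { void bar() { /* ... */ } }",
                            "int x = 0;\nint y = compute(")
    print(prompt)
```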
Implications and Future Directions
The proposed framework provides a mechanism for automatically generating more effective prompts without altering the LLM's weights, highlighting its versatility and practical application, especially in environments that strictly control access to models. The successful integration of repository-level context in prompt generation represents a significant stride in code modeling, suggesting that similar approaches might benefit other domains, such as question answering and multi-document summarization, where structured context retrieval is crucial.
Potential future developments might focus on scaling this framework to handle larger context lengths and experimenting with prompt generation for multi-line code auto-completion tasks. Moreover, exploring ways to incorporate this framework into environments with proprietary software or developing tailored adaptations for unique organizational coding practices could further extend its applicability.
Overall, the research offers a promising avenue for augmenting LLMs of code by systematically harnessing the untapped potential of repository-level information. The proposed prompts can leverage external contexts, making LLMs more effective even in tasks they are not explicitly fine-tuned to perform, thereby advancing the capabilities of AI-assisted programming tools.