CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases (2408.03910v2)

Published 7 Aug 2024 in cs.SE, cs.AI, and cs.CL

Abstract: LLMs excel in stand-alone code tasks like HumanEval and MBPP, but struggle with handling entire code repositories. This challenge has prompted research on enhancing LLM-codebase interaction at a repository scale. Current solutions rely on similarity-based retrieval or manual tools and APIs, each with notable drawbacks. Similarity-based retrieval often has low recall in complex tasks, while manual tools and APIs are typically task-specific and require expert knowledge, reducing their generalizability across diverse code tasks and real-world applications. To mitigate these limitations, we introduce CodexGraph, a system that integrates LLM agents with graph database interfaces extracted from code repositories. By leveraging the structural properties of graph databases and the flexibility of the graph query language, CodexGraph enables the LLM agent to construct and execute queries, allowing for precise, code structure-aware context retrieval and code navigation. We assess CodexGraph using three benchmarks: CrossCodeEval, SWE-bench, and EvoCodeBench. Additionally, we develop five real-world coding applications. With a unified graph database schema, CodexGraph demonstrates competitive performance and potential in both academic and real-world environments, showcasing its versatility and efficacy in software engineering. Our application demo: https://github.com/modelscope/modelscope-agent/tree/master/apps/codexgraph_agent.


CodexGraph: Bridging LLMs and Code Repositories via Code Graph Databases

The paper introduces CodexGraph, a system that improves how LLMs interact with large code repositories by giving them a graph database interface. CodexGraph targets the limitations of current methods, which rely on similarity-based retrieval or on manually built tools and APIs.

Problem and Motivation

LLMs have exhibited remarkable proficiency in handling standalone code tasks, such as those found in HumanEval and MBPP. However, these models struggle significantly when faced with tasks that span entire code repositories, mainly due to the complex dependencies and extensive context these repositories encompass. Existing solutions, like similarity-based retrieval methods, often suffer from low recall rates in complex tasks, while tools and API-based approaches necessitate extensive domain expertise and task-specific adaptations, limiting their general applicability.

Proposed Solution: CodexGraph

CodexGraph integrates LLM agents with graph database interfaces extracted from code repositories. It draws on the structural properties of graph databases and the flexibility of graph query languages, so that LLM agents can construct and execute precise, structure-aware queries. The core components of CodexGraph include:

Graph Database Schema: CodexGraph employs a schema that abstracts code repositories into code graphs, where nodes symbolize code elements (e.g., modules, classes, functions) and edges represent relationships (e.g., inheritance, usage). This schema facilitates a structured representation of the codebase, supporting multi-granular searches and topological analysis.
Shallow Indexing and Edge Completion: The system begins with a shallow indexing phase, a single pass over the codebase that captures symbols and their meta-information. An edge completion phase follows, using depth-first search (DFS) to resolve cross-file relationships and complete the code graph database.
LLM Interface Using Graph Query Language: The primary LLM agent writes its queries in natural language, and a separate translation LLM agent converts them into graph database queries. This "write then translate" strategy separates deciding what to retrieve from producing syntactically correct queries, which improves query generation accuracy and retrieval efficiency.
Iterative Pipeline for Task Execution: CodexGraph adopts an iterative query and retrieval approach, allowing LLM agents to progressively refine their queries based on the information retrieved. This iterative method enhances the system's capability to handle complex and multi-hop code reasoning tasks effectively.
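
The two indexing phases described above can be illustrated with a minimal sketch. It is not the CodexGraph implementation: the function names (`shallow_index`, `resolve`, `complete_edges`), the `INHERITS` edge label, and the toy `SOURCES` repository are all hypothetical, and only top-level definitions and `from ... import ...` statements are handled. The first function makes one pass over each file to record symbols and unresolved base-class names; the second follows import chains depth-first until it reaches the file that actually defines the symbol.

```python
import ast

def shallow_index(sources):
    """One pass per file: collect symbol nodes, import tables, and
    inheritance references that cannot be resolved yet."""
    nodes, imports, pending = {}, {}, []
    for path, code in sources.items():
        module = path[:-3].replace("/", ".")  # assumes a ".py" suffix
        nodes[module] = {"type": "MODULE", "file": path}
        imports[module] = {}
        for stmt in ast.parse(code).body:
            if isinstance(stmt, ast.ImportFrom) and stmt.module:
                for alias in stmt.names:
                    local = alias.asname or alias.name
                    imports[module][local] = (stmt.module, alias.name)
            elif isinstance(stmt, (ast.ClassDef, ast.FunctionDef)):
                kind = "CLASS" if isinstance(stmt, ast.ClassDef) else "FUNCTION"
                qualified = f"{module}.{stmt.name}"
                nodes[qualified] = {"type": kind, "file": path}
                if kind == "CLASS":
                    for base in stmt.bases:
                        if isinstance(base, ast.Name):
                            pending.append((qualified, module, base.id))
    return nodes, imports, pending

def resolve(module, name, nodes, imports, seen=None):
    """Follow import chains depth-first until a defined symbol is found."""
    seen = seen if seen is not None else set()
    if (module, name) in seen:
        return None  # cyclic import chain
    seen.add((module, name))
    qualified = f"{module}.{name}"
    if qualified in nodes:
        return qualified
    target = imports.get(module, {}).get(name)
    if target is None:
        return None  # symbol comes from outside the repository
    return resolve(target[0], target[1], nodes, imports, seen)

def complete_edges(nodes, imports, pending):
    """Turn unresolved base-class names into cross-file INHERITS edges."""
    edges = []
    for child, module, base in pending:
        resolved = resolve(module, base, nodes, imports)
        if resolved:
            edges.append((child, "INHERITS", resolved))
    return edges

# Toy repository: User inherits Base through a re-export and an alias.
SOURCES = {
    "core/base.py": "class Base:\n    pass\n",
    "core/api.py": "from core.base import Base\n",
    "app/models.py": "from core.api import Base as B\n\nclass User(B):\n    pass\n",
}

nodes, imports, pending = shallow_index(SOURCES)
edges = complete_edges(nodes, imports, pending)
print(edges)
```

In this toy repository the alias `B` in `app/models.py` is traced through the re-export in `core/api.py` back to `core.base.Base`, which is the kind of cross-file relationship a similarity-based retriever has no structural way to recover.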

Experimental Evaluation

CodexGraph's efficacy was assessed using three comprehensive and challenging benchmarks: CrossCodeEval, SWE-bench, and EvoCodeBench. The key findings from the experimental analysis include:

Performance Across Tasks: CodexGraph showed competitive performance across these benchmarks. For example, on the CrossCodeEval Lite (Python) dataset, it reached an exact match (EM) score of 27.90%, ahead of similarity-based retrieval (BM25) and other retrieval-augmented code generation (RACG) methods (AutoCodeRover).
Query Strategy Effectiveness: The best querying strategy differed across tasks. For CrossCodeEval Lite (Python), issuing multiple queries per round improved recall, whereas for SWE-bench a single query per round, which favors precision, yielded better results.
Advancements with LLMs: CodexGraph's performance significantly improved with the use of more advanced LLMs, such as GPT-4. This suggests that CodexGraph's structured and flexible interface can effectively leverage the evolving capabilities of LLMs.
Token Consumption: The longer and more complex queries generated by CodexGraph improve retrieval accuracy but incur higher token costs than other RACG methods.
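
The trade-off between single and multiple queries per round can be made concrete with a small sketch of an iterative retrieval loop. Everything here is an assumption for illustration: the graph is a plain Python dictionary rather than a graph database, `scripted_agent` is a deterministic stand-in for the LLM agent, and the names `run_query`, `iterative_retrieve`, and `GRAPH` are hypothetical.

```python
# node -> list of (relation, target) edges; a toy stand-in for a code graph
GRAPH = {
    "User": [("INHERITS", "Base"), ("USES", "hash_password")],
    "Base": [("CONTAINS", "Base.save")],
}

def run_query(graph, query):
    """Execute one structured query: (start node, relation) -> target nodes."""
    start, relation = query
    return [target for rel, target in graph.get(start, []) if rel == relation]

def scripted_agent(context):
    """Hypothetical stand-in for the LLM: proposes queries from what it has seen."""
    if not context:
        return [("User", "INHERITS"), ("User", "USES")]
    return [(node, "CONTAINS") for node in context]

def iterative_retrieve(graph, propose, max_rounds, queries_per_round):
    """Query, read the results, and refine over several rounds."""
    context = []
    for _ in range(max_rounds):
        queries = propose(context)[:queries_per_round]
        if not queries:
            break
        for query in queries:
            for hit in run_query(graph, query):
                if hit not in context:
                    context.append(hit)
    return context

wide = iterative_retrieve(GRAPH, scripted_agent, max_rounds=2, queries_per_round=2)
narrow = iterative_retrieve(GRAPH, scripted_agent, max_rounds=2, queries_per_round=1)
print(wide)    # ['Base', 'hash_password', 'Base.save']
print(narrow)  # ['Base', 'Base.save']
```

With two queries per round the loop also retrieves `hash_password`, which raises recall at the cost of a larger context; with one query per round the context stays narrow. Which setting works better depends on the task, as the benchmark results above indicate.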

Implications

Practical Applications: CodexGraph's integration into real-world software development was demonstrated by developing five practical applications (Code Chat, Code Debugger, Code Unittestor, Code Generator, and Code Commentor) using the ModelScope-Agent framework. These applications showcased CodexGraph’s utility in enhancing code comprehension, debugging, automated testing, generation, and documentation in practical scenarios.

Future Developments: The success of CodexGraph opens up several avenues for future research and applications. Extending the schema to support additional programming languages and optimizing the indexing efficiency are immediate next steps. Furthermore, integrating advanced multi-agent collaboration techniques could further enhance the system's flexibility and performance across diverse code tasks.

Conclusion

CodexGraph is a step towards bridging LLMs and code repositories. By combining graph database interfaces with task-specific querying strategies, it addresses limitations of existing RACG methods and points towards more versatile solutions in automated software engineering. The work underscores the potential of structured, graph-based approaches for improving how well LLMs scale to complex, real-world code repositories.


Authors

Xiangyan Liu
Bo Lan
Zhiyuan Hu
Yang Liu
Zhicheng Zhang
Wenmeng Zhou
Fei Wang
Michael Shieh
