
Graph-Based Learning for Function Localization

Updated 4 January 2026
  • Graph-based function localization is a methodology that represents codebases as structured graphs to precisely identify functions needing modifications.
  • It leverages multi-hop reasoning, graph traversal, and LLM-guided search to uncover complex causal relationships among code entities.
  • Frameworks like LocAgent and GraphLocator demonstrate state-of-the-art accuracy with improved recall and cost-efficiency on extensive real-world benchmarks.

Graph-based learning for function-level localization refers to the set of methodologies that employ structured graph representations of codebases to precisely identify functions in need of modification, typically in response to natural language issue descriptions. These approaches leverage the hierarchical and relational nature of source code—spanning directories, files, classes, and functions—and use graph traversal, causal inference, and semantic search, often interfacing with LLMs. The two leading frameworks in this domain, LocAgent (Chen et al., 12 Mar 2025) and GraphLocator (Liu et al., 27 Dec 2025), demonstrate distinct paradigms, yet both achieve state-of-the-art function-level localization on extensive real-world benchmarks.

1. Graph Construction and Repository Representation

Graph-based localization pipelines commence by parsing software repositories into multi-layered heterogeneous graphs. LocAgent generates a directed heterogeneous graph G = (V, E), where nodes v represent directories, files, classes, or functions, and edges e encode relations such as contain, import, invoke, and inherit. Function-level nodes preserve complete source code as documents, supporting rich semantic queries.

GraphLocator introduces the Repository Dependency Fractal Structure (RDFS),

\mathcal{R} = (V, E, T, C, \mathit{type}, \mathit{code}, \mathit{layer}),

organizing nodes by their type and granularity layer. Edge types (HasMember, ImportedBy, UsedBy, ExtendedBy) capture both vertical and horizontal dependencies, and static analysis is leveraged on-demand for computationally intensive relations. This graph is maintained in a database, facilitating efficient node and edge retrieval during the localization process.
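The layered, typed structure shared by both representations can be sketched with a minimal adjacency-list graph. This is a didactic sketch, not code from either framework; the class name, entity IDs, and edge types below are illustrative.

```python
# Minimal sketch of a multi-layer repository graph with typed nodes and
# typed edges, in the spirit of LocAgent's heterogeneous graph and
# GraphLocator's RDFS. All names here are illustrative.
from collections import defaultdict

class RepoGraph:
    def __init__(self):
        self.nodes = {}                 # id -> {"type", "layer", "code"}
        self.edges = defaultdict(list)  # src -> [(edge_type, dst)]

    def add_node(self, node_id, node_type, layer, code=""):
        self.nodes[node_id] = {"type": node_type, "layer": layer, "code": code}

    def add_edge(self, src, dst, edge_type):
        self.edges[src].append((edge_type, dst))

    def neighbors(self, node_id, edge_type=None):
        # edge_type=None returns neighbors across all relations.
        return [d for t, d in self.edges[node_id] if edge_type in (None, t)]

g = RepoGraph()
g.add_node("pkg/", "directory", 0)
g.add_node("pkg/io.py", "file", 1)
g.add_node("pkg/io.py::Reader", "class", 2, "class Reader: ...")
g.add_node("pkg/io.py::Reader.read", "function", 3, "def read(self): ...")
g.add_edge("pkg/", "pkg/io.py", "contain")
g.add_edge("pkg/io.py", "pkg/io.py::Reader", "contain")
g.add_edge("pkg/io.py::Reader", "pkg/io.py::Reader.read", "HasMember")
```

Storing full source on function-level nodes, as both frameworks do, lets later retrieval steps return code bodies without re-parsing the repository.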

Both frameworks rely on sparse indexing—keyword and ID lookups for LocAgent, and fuzzy name/string matching in GraphLocator—to enable rapid entity search across millions of code entities.
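A minimal sketch of such a sparse index, with an exact keyword lookup backed by fuzzy matching; `difflib` stands in for the real matchers, and the entity IDs are invented for illustration.

```python
# Sparse indexing sketch: exact keyword/ID lookup first (LocAgent-style),
# falling back to fuzzy name matching (GraphLocator-style).
import difflib

entity_ids = [
    "pkg/io.py::Reader.read",
    "pkg/io.py::Reader.close",
    "pkg/net.py::Client.connect",
]

# Exact keyword index: token -> entity ids containing that token.
keyword_index = {}
for eid in entity_ids:
    for token in eid.replace("::", "/").replace(".", "/").split("/"):
        keyword_index.setdefault(token.lower(), set()).add(eid)

def search_entity(query):
    """Return exact token hits if any, else fuzzy matches on full IDs."""
    hits = keyword_index.get(query.lower())
    if hits:
        return sorted(hits)
    return difflib.get_close_matches(query, entity_ids, n=3, cutoff=0.4)
```

Because the index maps names rather than embeddings, lookups stay cheap even as the number of indexed entities grows.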

2. LLM-Driven Multi-Hop Reasoning and Graph Traversal

LocAgent employs a fine-tuned Qwen-2.5-Coder-Instruct-32B agent architected to reason over graph-structured contexts, using specialized tool calls (e.g., SearchEntity, TraverseGraph, RetrieveEntity) and chain-of-thought prompting. The agent performs multi-hop subgraph traversal, aggregating both direct and indirect evidence from directory, file, class, and function relationships:

h_f^{(t+1)} = \sigma\left(\sum_{r \in R} \sum_{u \in N_r(f)} W_r \cdot h_u^{(t)}\right)

where N_r(f) denotes the neighbors of f under relation r, and W_r is a learned transform. At runtime, the Qwen agent interprets tree-formatted subgraph text returned by TraverseGraph calls, effectively performing prompt-driven message passing that substitutes for explicit graph neural network embedding.
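The update rule above can be illustrated numerically. Note that LocAgent realizes this aggregation through prompts rather than explicit weights, so the scalar features, weights, and node names below are purely didactic.

```python
# Numerical sketch of the relational aggregation: a function node updates
# its feature by summing relation-specific transforms of its neighbors'
# features, then applying a nonlinearity. Scalars stand in for vectors,
# tanh for sigma; all values are toy numbers.
import math

h = {"f": 0.5, "g": 1.0, "cls": -0.5}   # current node features
W = {"invoke": 0.8, "contain": 0.3}     # per-relation "learned" weights
neighbors = {                            # node -> {relation: [neighbors]}
    "f": {"invoke": ["g"], "contain": ["cls"]},
}

def update(node):
    total = sum(
        W[rel] * h[u]
        for rel, nbrs in neighbors[node].items()
        for u in nbrs
    )
    return math.tanh(total)

h_next = update("f")   # tanh(0.8 * 1.0 + 0.3 * (-0.5)) = tanh(0.65)
```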

GraphLocator utilizes an LLM-driven graph reasoning agent ("CausalAgent"), which incrementally grows a Causal Issue Graph (CIG) from symptom vertices through abductive reasoning. Each step expands nodes and edges in the CIG according to estimated causal strengths, directing focus to those sub-issues and code entities most likely to underlie observed symptoms. An explicit priority queue manages expansion order using the score:

\Psi(x) = 1 - \prod_{(x,y)\in\mathcal{Y}} (1-\psi(x,y))

with ψ(x, y) the LLM-judged probability that x causes y. This workflow robustly handles multi-hop (symptom-to-cause) chains and issues spanning multiple interdependent functions.
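A minimal sketch of the Ψ-driven expansion, with `heapq` as the priority queue and hard-coded stand-ins for the LLM-judged ψ(x, y) values; the sub-issue names are invented.

```python
# Priority-driven CIG expansion sketch: candidate sub-issues are scored by
# Psi(x) = 1 - prod(1 - psi(x, y)) over the symptoms y they may cause, and
# a max-heap picks the next vertex to expand. The psi values here are toy
# stand-ins for LLM-judged causal probabilities.
import heapq

def psi_score(causal_probs):
    """Psi(x): probability that x causes at least one linked symptom."""
    p = 1.0
    for prob in causal_probs:
        p *= (1.0 - prob)
    return 1.0 - p

# candidate sub-issue -> psi(x, y) for each causally linked symptom y
candidates = {
    "cache_invalidation": [0.6, 0.3],  # Psi = 1 - 0.4 * 0.7 = 0.72
    "race_in_writer":     [0.9],       # Psi = 0.90
    "stale_config":       [0.2],       # Psi = 0.20
}

# heapq is a min-heap, so negate scores to expand the highest Psi first.
queue = [(-psi_score(ps), x) for x, ps in candidates.items()]
heapq.heapify(queue)
_, first = heapq.heappop(queue)   # "race_in_writer" expands first
```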

3. Causal Reasoning and Sub-Issue Disentangling

GraphLocator’s CIG formalism decomposes an issue I into a set X of sub-issues, each causally linked (through the causal-edge set Y) and grounded (through the mapping φ) to code entities. Through dynamic expansion, the system locates initial symptom vertices and iteratively discovers deeper, causally implicated functions. This process formally models the semantic gap between observed symptoms and root causes, as well as complex one-to-many mappings where a single issue implicates multiple regions of code.

Symptom-to-cause mismatch is addressed by expanding from observed symptoms across causal paths in the graph. One-to-many mismatch is handled by growing a forest of sub-issues, each localized to relevant functions, thereby achieving high recall even in the presence of distributed bugs or feature requests.

4. Scoring, Ranking, and Output Aggregation

LocAgent aggregates its final candidate functions via reciprocal-rank self-consistency over multiple LLM runs:

\mathrm{score}(f) = \sum_{t=1}^{T} \frac{1}{\mathrm{rank}_t(f)}

ensuring reliability through consensus across diverse chain-of-thought trajectories. Decoding constraints mandate that all agent outputs are valid code entity IDs, and functions are sorted by score.
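The reciprocal-rank aggregation can be sketched directly; the candidate function IDs below are invented for illustration.

```python
# Reciprocal-rank self-consistency sketch: each of T runs returns a ranked
# list of candidate function IDs, and a function's final score sums 1/rank
# over the runs in which it appears.
from collections import defaultdict

def aggregate(runs):
    scores = defaultdict(float)
    for ranking in runs:                          # one ranking per LLM run
        for rank, fn in enumerate(ranking, start=1):
            scores[fn] += 1.0 / rank
    return sorted(scores, key=scores.get, reverse=True)

runs = [
    ["utils.parse", "io.read", "net.send"],
    ["io.read", "utils.parse"],
    ["utils.parse", "net.send"],
]
final = aggregate(runs)   # "utils.parse" first: 1 + 1/2 + 1 = 2.5
```

Functions that appear consistently near the top across diverse chain-of-thought trajectories dominate the final ranking, which is the consensus effect the scoring rule is designed to produce.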

GraphLocator does not train new parameters, relying solely on the pretrained LLM’s next-token prediction scores and prompt designs. Expansion order and candidate prioritization are dictated by Ψ(x) within the evolving CIG, and relevance scoring is decided internally by the LLM.

5. Evaluation Protocols and Benchmark Results

Function-level localization performance is measured by accuracy (Acc@k), recall, precision, and F1. For LocAgent:

  • SWE-Bench-Lite: function Acc@5 = 71.90%, Acc@10 = 77.01%
  • Loc-Bench: function Acc@10 = 60.25%, Acc@15 = 62.10%
  • Downstream Pass@10 for bug-fix increases by 12% over prior agentic baselines

For GraphLocator:

  • Claude-3.5 model: function-level recall improvement of +19.49 pp (from ~53.90% to 73.39%), precision +11.89 pp over the previous best
  • Specialized handling of multi-hop issues: the recall drop at graph distance > 1 is mitigated (–10 pp vs –25 pp for baselines)
  • Robust recovery of distributed issues (≥3 functions): ~60% recall vs ~40% for other LLM agents
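Under one common reading of these metrics (a hit at k if any gold function appears in the top k, and set overlap for precision/recall; exact definitions vary by paper), they can be sketched as:

```python
# Metric sketches for function-level localization, assuming one gold set
# of edited functions per issue and one ranked prediction list.
def acc_at_k(ranked, gold, k):
    """1.0 if any gold function appears in the top-k predictions."""
    return float(bool(set(ranked[:k]) & set(gold)))

def precision_recall_f1(predicted, gold):
    tp = len(set(predicted) & set(gold))
    p = tp / len(predicted) if predicted else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f1

ranked = ["a.f", "b.g", "c.h", "d.k"]   # invented function IDs
gold = ["c.h", "x.y"]
hit = acc_at_k(ranked, gold, 3)                     # 1.0: c.h is in top 3
p, r, f1 = precision_recall_f1(ranked[:3], gold)    # p = 1/3, r = 1/2
```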

Both frameworks surpass previous embedding-based and agentless LLM approaches in rigorous multi-repository, multi-language evaluations.

6. Implementation and Cost-Efficiency

LocAgent's implementation parses repositories into graphs, constructs sparse indices mapping entity names and code chunks to node IDs, and automates the localization loop via programmatic prompts and tool executions. The agent loop and tool functions are defined in pseudocode, illustrating integration with BM25 search, graph traversal, and code retrieval.

A detailed cost analysis reveals substantial efficiency gains: using open-source Qwen models, LocAgent reduces per-query API cost by 86% while matching proprietary model performance. Specifically, fine-tuned Qwen-7B costs $0.05/query at function Acc@10 = 64.23%, and fine-tuned Qwen-32B achieves Acc@10 = 77.01% for $0.09/query. Ablations underscore the necessity of graph-guided multi-hop reasoning and search functions for optimal accuracy.

GraphLocator is architected for zero-training deployment; all learning is encapsulated in pretrained GPT-4o or Claude-3.5 weights. No parameter optimization occurs beyond prompt and CIG priority score tuning.

7. Handling Core Software Localization Challenges

Both LocAgent and GraphLocator directly address two longstanding challenges: bridging symptom-to-cause semantic gaps and resolving one-to-many function mappings. The multi-layered graphs (RDFS or heterogeneous repository graphs), prompt-engineered LLM reasoning, and dynamic causal graph expansions together facilitate robust, high-recall localization even when issue descriptions are ambiguous or distributed across diverse code regions.

Experimental evidence indicates that graph-based LLM-agent frameworks can substantially outperform prior neural ranking and procedural baselines, especially as codebase complexity and interdependence increase. This suggests broader applicability for automated software engineering tasks requiring principled multi-hop reasoning over structured code data.

Comparison of Approaches

| Framework | Graph Structure | LLM Integration | Output Aggregation |
|---|---|---|---|
| LocAgent (Chen et al., 12 Mar 2025) | Directed heterogeneous; edges: contain, import, invoke, inherit | Fine-tuned Qwen-32B agent with prompt-driven tools | Reciprocal-rank scoring over T runs |
| GraphLocator (Liu et al., 27 Dec 2025) | 4-layer RDFS; Causal Issue Graph (CIG) discovery | Pretrained GPT-4o / Claude-3.5 with dynamic graph-expansion prompts | Priority queue Ψ(x); CIG node discovery |

Both frameworks eschew explicit GNN layers in favor of LLM-directed reasoning over graph-serialized context and structured prompts.

Significance and Plausible Implications

The prominent adoption of graph-based learning for function-level localization, as instantiated by LocAgent and GraphLocator, signals a methodological shift toward agentic, graph-guided, LLM-based systems in code intelligence. These approaches demonstrate the utility of combining symbolic graph methods with semantic LLM reasoning, yielding substantial improvements in localization accuracy, computational cost, and scalability. A plausible implication is the emergence of generalized graph-guided agents as central tools for automated code maintenance, bug triage, and feature attribution in large heterogeneous codebases.
