RepoLens: Code Abstraction for Localization
- RepoLens is a system that extracts and enriches conceptual knowledge from code repositories to enhance issue localization by clustering semantically coherent concerns.
- It employs offline extraction with AST parsing and LLM-based prompting, generating enriched term data to counter the challenges of concern mixing and scattering.
- Integration with LLMs refines concern ranking and localization performance, demonstrating notable empirical improvements in both file-level and function-level tasks.
RepoLens is a system for abstracting and leveraging conceptual knowledge from code repositories to enhance issue localization—the process of identifying faulty code elements such as files or functions—in large, complex software projects. By decomposing fine-grained functionalities and recomposing them into high-level, semantically coherent concerns, RepoLens guides LLMs to achieve more effective localization. It operates through offline and online stages involving conceptual extraction, enrichment, concern clustering, and minimally invasive integration into standard LLM-based localization workflows, demonstrating significant improvements in empirical evaluation across multiple state-of-the-art tools (Wang et al., 25 Sep 2025).
1. Conceptual Knowledge Extraction
RepoLens mines code repositories to identify domain-specific conceptual terms derived from identifiers in function definitions. Using Abstract Syntax Tree (AST) parsing via Tree-sitter, the system traverses source files and extracts identifier elements, splitting them according to naming conventions (camelCase, snake_case) and tagging their parts of speech (using approaches such as POSSE). Noun words or noun phrases (e.g., "user name," "transaction id") are identified as conceptual terms.
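To make the splitting step concrete, the following is a minimal Python sketch, assuming the identifiers have already been collected from a Tree-sitter traversal of function definitions; the helper name and example identifiers are illustrative, and the part-of-speech filtering step (e.g., POSSE) is omitted.

```python
import re

def split_identifier(identifier: str) -> list[str]:
    """Split an identifier into lowercase words by snake_case and camelCase."""
    words = []
    for part in identifier.split("_"):  # snake_case pieces
        # camelCase / PascalCase pieces, keeping acronyms such as "HTTP" intact
        words += re.findall(r"[A-Z]+(?![a-z])|[A-Z][a-z]*|[a-z]+|\d+", part)
    return [w.lower() for w in words if w]

# Hypothetical identifiers as produced by an AST traversal of function definitions.
for ident in ["getUserName", "transaction_id", "parseHTTPResponse"]:
    print(ident, "->", split_identifier(ident))
# getUserName -> ['get', 'user', 'name']
# transaction_id -> ['transaction', 'id']
# parseHTTPResponse -> ['parse', 'http', 'response']
```

Noun words and noun phrases among the resulting word sequences would then be retained as candidate conceptual terms.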
Each term is subjected to multi-step LLM prompting for semantic enrichment:
- Expanded Name: Resolves abbreviations or aliases.
- Definition: Concisely describes what the term is, independent of usage.
- Term-Centric Functionality Summary: Isolates the function logic relevant to the term.
- Reference Code Snippet: Exemplifies the term’s role.
This process uses integrated chain-of-thought (CoT) prompts for cost-effective, coherent output. All enriched term data, with associated metadata (location, function), forms a knowledge base spanning the entire repository.
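A minimal sketch of what an integrated enrichment call could look like, assuming an OpenAI-compatible chat client; the prompt wording, model name, and `enrich_term` helper are illustrative rather than RepoLens's actual prompts.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

ENRICH_PROMPT = """\
You are analyzing the conceptual term "{term}" found in the function below.
Reason internally step by step, then answer only in JSON with the keys:
  "expanded_name": the full name if "{term}" is an abbreviation or alias,
  "definition": a one-sentence definition of the term, independent of usage,
  "functionality_summary": only the logic in this function that concerns the term,
  "reference_snippet": the code lines that best exemplify the term's role.

Function ({file_path}):
{function_source}
"""

def enrich_term(term: str, file_path: str, function_source: str) -> dict:
    """Single integrated chain-of-thought call producing all four enrichment fields."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative choice; any capable chat model works
        messages=[{"role": "user", "content": ENRICH_PROMPT.format(
            term=term, file_path=file_path, function_source=function_source)}],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)
```

Batching all four fields into one prompt is what keeps the enrichment pass cost-effective while keeping the outputs mutually consistent.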
Similarity between terms (ntᵢ, ntⱼ) for initial concern pre-clustering is computed as a weighted combination of field similarities plus a call bonus:

sim(ntᵢ, ntⱼ) = w₁ · sim_name(ntᵢ, ntⱼ) + w₂ · sim_def(ntᵢ, ntⱼ) + w₃ · sim_func(ntᵢ, ntⱼ) + w₄ · call_bonus(ntᵢ, ntⱼ)

where sim_name, sim_def, and sim_func are embedding-based cosine similarities over the terms' expanded names, definitions, and functionality summaries, and call_bonus(ntᵢ, ntⱼ) is 1 if a call relationship exists between the functions containing ntᵢ and ntⱼ, and 0 otherwise.
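A minimal sketch of this similarity computation, assuming precomputed field embeddings and a set of caller-callee function pairs; the weights and bonus value are illustrative placeholders, not RepoLens's actual settings.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def term_similarity(t_i: dict, t_j: dict, calls: set[tuple[str, str]],
                    weights=(1 / 3, 1 / 3, 1 / 3), bonus_weight=0.1) -> float:
    """Weighted combination of field similarities plus an optional call bonus.

    Each term dict is assumed to hold embeddings of its expanded name,
    definition, and functionality summary, plus its enclosing function id.
    """
    sims = [
        cosine(t_i["name_emb"], t_j["name_emb"]),
        cosine(t_i["def_emb"], t_j["def_emb"]),
        cosine(t_i["func_emb"], t_j["func_emb"]),
    ]
    call_bonus = 1.0 if (t_i["function"], t_j["function"]) in calls \
        or (t_j["function"], t_i["function"]) in calls else 0.0
    return sum(w * s for w, s in zip(weights, sims)) + bonus_weight * call_bonus
```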
2. Stages of Operation
RepoLens is structured in two stages for efficient and accurate issue localization.
- Offline Stage: Conceptual knowledge extraction and enrichment, as described above, decompose repository logic into enriched conceptual terms; storing these term-centric functionalities in a knowledge base counters concern mixing (multiple responsibilities within a single function).
- Online Stage: For a given issue description (typically its title), noun keywords are extracted from the issue and matched against the knowledge base to select related conceptual terms. Concern clustering then aggregates these terms into semantically coherent "concerns" representing functional responsibilities, first via similarity-based pre-clustering and then via LLM-based refinement (see the sketch below). Concerns are ranked by relevance using embedding similarity followed by fine-grained LLM-based re-ranking, and the top-N concerns are selected for integration.
This dual-stage pipeline supports real-time issue localization by focusing debugging and search efforts on the most relevant conceptual clusters.
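A minimal sketch of the similarity-based pre-clustering step, assuming a simple threshold rule over pairwise term similarity; the threshold value and union-find formulation are illustrative, and the resulting groups would then be refined and labeled by an LLM as described above.

```python
def pre_cluster(terms: list[dict], similarity, threshold: float = 0.75) -> list[list[int]]:
    """Group terms whose pairwise similarity exceeds a threshold (union-find)."""
    parent = list(range(len(terms)))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(x: int, y: int) -> None:
        parent[find(x)] = find(y)

    for i in range(len(terms)):
        for j in range(i + 1, len(terms)):
            if similarity(terms[i], terms[j]) >= threshold:
                union(i, j)

    clusters: dict[int, list[int]] = {}
    for i in range(len(terms)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())
```

Here `similarity` could be the `term_similarity` function from the earlier sketch, applied only to the issue-related terms retrieved online.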
3. Integration with LLMs
LLMs are employed throughout RepoLens:
- Extraction and Enrichment: LLMs expand abbreviated names, generate definitions, produce term-centric summaries, and extract code supporting the summary.
- Concern Clustering: LLMs aggregate functionalities into cohesive concerns based on formal definitions and instructions.
- Ranking: LLMs refine the prioritization of concerns relevant to a specific issue.
- Workflow Enhancement: When combined with localization tools, RepoLens augments prompts:
- For workflow-based systems (e.g., AgentLess), advice such as “pay special attention to files that appear in the Concern list and are semantically connected to the problem description” is appended.
- For agent-based systems (e.g., OpenHands, mini-SWE-agent), system prompts are extended to include concern summaries, file locations, and associated functions.
This minimally invasive augmentation ensures that LLMs directly benefit from the semantically structured conceptual map generated by RepoLens, improving their ability to navigate large, heterogeneous codebases.
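A minimal sketch of such prompt augmentation, assuming each concern carries a summary plus associated file and function lists; the field names and formatting are illustrative, while the appended advice follows the wording quoted above.

```python
def augment_prompt(base_prompt: str, concerns: list[dict], top_n: int = 3) -> str:
    """Append the top-N concerns (summary, files, functions) to a localization prompt."""
    lines = ["", "## Concern list (most relevant first)"]
    for rank, concern in enumerate(concerns[:top_n], start=1):
        lines.append(f"{rank}. {concern['summary']}")
        lines.append(f"   files: {', '.join(concern['files'])}")
        lines.append(f"   functions: {', '.join(concern['functions'])}")
    lines.append(
        "Pay special attention to files that appear in the Concern list and are "
        "semantically connected to the problem description."
    )
    return base_prompt + "\n".join(lines)
```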
4. Empirical Evaluation and Generalization
RepoLens was evaluated using the SWE-Lancer-Loc benchmark (216 tasks from the Expensify codebase, 2.04M LOC) for file-level and function-level issue localization. Baseline tools included AgentLess, OpenHands, and mini-SWE-agent across multiple LLMs (GPT‑4o, GPT‑4o‑mini, GPT‑4.1).
Key findings:
- File-level Localization: Average relative gains of over 22% in Hit@k and 46% in Recall@k. For example, AgentLess’s Hit@1 improved from 15.28% to 29.17%.
- Function-level Localization: Relative gains reach up to 504% in Hit@1 and 376% in Recall@10, depending on the base model.
- Generalization: RepoLens’ improvements are consistent across all baselines and LLMs tested.
- Ablation: Removing term explanation (w/o Exp) or concern clustering (w/o Con) degrades performance relative to the complete pipeline (e.g., AgentLess's Hit@1 drops from 23.15% to 20.91% or lower).
Manual evaluation confirms that over 97% of concerns are correct and complete, with an average developer score of 3.71/4, indicating reliable, high-quality concern extraction.
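For reference, the reported metrics can be computed per task as in the following sketch, assuming a ranked list of predicted locations and a set of ground-truth locations; this reflects the standard Hit@k/Recall@k definitions rather than the benchmark's own evaluation code.

```python
def hit_at_k(predicted: list[str], gold: set[str], k: int) -> float:
    """1.0 if any of the top-k predictions is a ground-truth location, else 0.0."""
    return 1.0 if any(p in gold for p in predicted[:k]) else 0.0

def recall_at_k(predicted: list[str], gold: set[str], k: int) -> float:
    """Fraction of ground-truth locations recovered within the top-k predictions."""
    return len(set(predicted[:k]) & gold) / len(gold) if gold else 0.0

# Benchmark scores average these per-task values across all tasks.
print(hit_at_k(["a.py", "b.py"], {"b.py", "c.py"}, k=1))    # 0.0
print(recall_at_k(["a.py", "b.py"], {"b.py", "c.py"}, k=2))  # 0.5
```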
5. Addressing Concern Mixing and Concern Scattering
RepoLens specifically mitigates two challenges in large-scale codebases:
- Concern Mixing: Relevant logic for an issue is often embedded within complex, multi-responsibility functions. By extracting fine-grained term-centric functionalities and grouping relevant ones into coherent concerns, RepoLens isolates just the semantic components required for accurate localization.
- Concern Scattering: Functional logic pertaining to an issue may be distributed across many files. Concern clustering brings together interdependent code fragments, constructing a conceptual map that focuses search and debugging efforts, reducing the need for exhaustive traversal.
Both mechanisms improve the ability of LLM-based and agent-based localization tools to surface the root cause of an issue effectively.
6. Connections to Code Reflection and Reproducibility Assessment
RepoLens incorporates and extends insights from adjacent research directions:
- Repository-Based Code Reflection: The LiveRepoReflection benchmark and RepoReflection-Instruct dataset (Zhang et al., 14 Jul 2025) emphasize multi-file code understanding, modification, and repair by LLMs within realistic repository structures. The two-turn dialogue pipeline (generation and error-driven repair) and metrics (Pass@1, Pass@2, Fix Weight) characterize model performance on reflective programming tasks. RepoLens generalizes these approaches by organizing conceptual knowledge and concerns that improve reflective debugging and contextual repair during localization.
- Automated Reproducibility Assessment: Section-based Readme classification and hierarchical transformer scoring (Akdeniz et al., 2023) allow RepoLens to quantitatively assess documentation quality and reproducibility checklist conformance at the repository level, providing actionable feedback for maintainers and reviewers.
- Backend Platform for Experiment Reproducibility: By leveraging reproducibility package generation, language and dependency inference, and workflow encapsulation (Costa et al., 2023), RepoLens can interface with reproducibility platforms to ensure that localization findings, experiment code, and computational environments remain portable and verifiable.
7. Future Directions and Prospects
Suggested advancements for RepoLens include:
- Integration with advanced program analysis techniques such as program slicing, control-flow/data-flow analysis, and dynamic execution tracing, which would enhance the precision of concern formation and identification of subtle code dependencies.
- Development of dedicated, concern-aware localization agents, moving beyond prompt enhancements toward architectures that natively exploit concern structure, potentially yielding further performance improvements.
- Expansion to heterogeneous repository scenarios—incorporating real-world projects, third-party library integrations, and large-scale multilingual codebases.
- Exploration of reinforcement learning from human feedback (RLHF, DPO) for improved alignment with human debugging strategies, particularly in iterative and interactive repair workflows.
RepoLens thus establishes a foundation for conceptual abstraction and intelligent localization in software repositories, with empirical validation and systematic rigor. Its approach addresses demonstrated challenges in contemporary LLM-based localization—concern mixing and scattering—while providing a framework extensible to broader directions in reproducibility, code reflection, and repository analysis.