InspectCoder: Agentic Code Repair
- InspectCoder is a dynamic analysis–enabled, agentic program repair system that empowers LLMs to identify and fix buggy code through interactive debugger sessions.
- It employs a dual-agent framework where a Program Inspector manages breakpoints and runtime state inspection while a Patch Coder synthesizes verified repairs.
- Empirical results demonstrate significant improvements in bug repair accuracy and efficiency, transforming traditional trial-and-error debugging into a targeted, iterative process.
InspectCoder is a dynamic analysis–enabled, agentic program repair system that empowers LLMs to diagnose and fix buggy code by controlling debugger sessions and performing systematic, interactive root cause analysis. Unlike previous LLM-based self-repair frameworks that rely either on static semantic analysis or on minimally guided log collection, InspectCoder enables the LLM to actively place breakpoints, inspect and modify intermediate program states, and conduct incremental experiments within a stateful debugger session. This transforms debugging from trial and error into a deliberate process of runtime investigation. InspectCoder incorporates a dual-agent framework, integrates the open-source InspectWare middleware for debugger abstraction, and demonstrates substantial advancements on challenging self-repair benchmarks.
1. Agentic Debugging and Self-Repair Paradigm
InspectCoder is designed to address the diagnostic limitations of static or log-based LLM repair approaches by giving LLMs access to dynamic analysis through direct debugger interaction. In the self-repair workflow, the inputs are a coding requirement, a failing test, and an LLM-generated buggy implementation; the goal is to synthesize a corrected version such that all provided tests pass. The LLM operates in a loop: it generates a candidate repair, runs it against the tests, and, if failures remain, invokes InspectCoder's dynamic analysis capabilities for root cause determination and iterative patching.
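The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not InspectCoder's implementation: `self_repair`, `run_tests`, `diagnose`, and `patch` are hypothetical names, and the diagnosis and patching steps are hard-coded stand-ins for what would be debugger-driven analysis and LLM patch synthesis.

```python
from typing import Callable, Optional


def self_repair(
    code: str,
    run_tests: Callable[[str], Optional[str]],
    diagnose: Callable[[str, str], str],
    patch: Callable[[str, str], str],
    max_rounds: int = 3,
) -> Optional[str]:
    """Return source that passes the tests, or None if the round budget runs out."""
    for _ in range(max_rounds):
        failure = run_tests(code)         # None means every test passed
        if failure is None:
            return code
        report = diagnose(code, failure)  # stand-in for dynamic root cause analysis
        code = patch(code, report)        # stand-in for LLM patch synthesis
    return code if run_tests(code) is None else None


# Toy stand-ins (all hypothetical) so the loop can be run end to end.
BUGGY = "def add(a, b):\n    return a - b\n"


def run_tests(src: str) -> Optional[str]:
    namespace: dict = {}
    exec(src, namespace)                  # load the candidate implementation
    got = namespace["add"](2, 3)
    return None if got == 5 else f"add(2, 3) returned {got}, expected 5"


def diagnose(src: str, failure: str) -> str:
    return "subtraction used where addition was intended"


def patch(src: str, report: str) -> str:
    return src.replace("a - b", "a + b")


fixed = self_repair(BUGGY, run_tests, diagnose, patch)
```

The loop terminates either when the tests pass or when `max_rounds` is exhausted; the round budget is an assumption of this sketch, not a parameter taken from the source.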
2. Dual-Agent Framework and Dynamic Analysis
The InspectCoder architecture is composed of two collaborative agents:
- Program Inspector: This agent, built on the ReAct (reasoning and acting) paradigm, manages a dialogue with a live Python debugger session, issuing actions such as breakpoint placement, variable inspection, and runtime intervention. It incrementally builds a hypothesis of the bug’s root cause based on dynamic execution evidence, delivering a structured analysis report.
- Patch Coder: This agent consumes the Inspector’s report to synthesize a candidate patch, applies the patch, and verifies it against the test suite. If the repair is unsuccessful, control returns to the Program Inspector for further interactive analysis, creating a feedback loop until the bug is resolved.
This dual-agent orchestration is critical for transforming static LLM repair into a feedback-driven, runtime-aware process that mimics expert debugging strategies.
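The hand-off between the two agents can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `AnalysisReport` fields, the `inspect` loop, the `print_var` tool, and the scripted policy are hypothetical, and a fixed dictionary stands in for the live debugger session. The point is only the shape of a ReAct-style loop (choose an action, receive an observation, repeat) that ends in a structured report consumed by the patching step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class AnalysisReport:
    root_cause: str
    evidence: List[str] = field(default_factory=list)


def inspect(
    policy: Callable[[List[str]], Tuple[str, str]],
    tools: Dict[str, Callable[[str], object]],
    max_steps: int = 10,
) -> AnalysisReport:
    """ReAct-style loop: the policy picks an action, a tool returns an observation."""
    evidence: List[str] = []
    for _ in range(max_steps):
        action, arg = policy(evidence)
        if action == "report":            # the policy decides it has enough evidence
            return AnalysisReport(root_cause=arg, evidence=evidence)
        observation = tools[action](arg)
        evidence.append(f"{action}({arg}) -> {observation}")
    return AnalysisReport(root_cause="undetermined", evidence=evidence)


# Frozen stand-in for runtime state observed while averaging a 3-element list.
state = {"total": 6, "count": 2}
tools = {"print_var": lambda name: state[name]}


def scripted_policy(evidence: List[str]) -> Tuple[str, str]:
    """Fixed script in place of LLM reasoning."""
    script = [("print_var", "total"), ("print_var", "count")]
    if len(evidence) < len(script):
        return script[len(evidence)]
    return ("report", "count excludes one element (off by one)")


def patch_coder(source: str, report: AnalysisReport) -> str:
    """Toy patch step keyed on the report; a real coder would be an LLM call."""
    if "off by one" in report.root_cause:
        return source.replace("len(xs) - 1", "len(xs)")
    return source


report = inspect(scripted_policy, tools)
patched = patch_coder("return sum(xs) / (len(xs) - 1)", report)
```

Keeping the report as an explicit data structure makes the contract between the two agents inspectable, which is one plausible reason to separate diagnosis from patching.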
3. Strategic Breakpoint Placement and Runtime State Experimentation
A defining aspect of InspectCoder is its data-driven breakpoint selection and runtime state manipulation:
- Breakpoint Placement: Rather than using fixed logging or blanket instrumentation, the Program Inspector dynamically identifies suspicious code locations (e.g., branch points, assignments with suspect values) and assigns breakpoints to facilitate stepwise execution control and targeted data inspection.
- Incremental State Inspection and Perturbation: Upon hitting a breakpoint, the agent can query the values of variables and expressions, build temporal data traces to track how errors propagate, and modify the running program (inserting or altering variable values or code) within the session to test hypotheses experimentally. This capability allows the system to check whether a hypothesized patch would repair the failure before committing changes to the source.
Such interactivity is made feasible by InspectWare’s robust management of stateful debugging sessions, which abstracts mode transitions and enforces session coherence.
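The idea of stopping at a chosen line and evaluating expressions against live state can be sketched with Python's standard `sys.settrace` hook. This is not InspectWare's mechanism or API; `probe` and `buggy_mean` are hypothetical names, and the breakpoint is expressed as a line offset from the `def` line purely for convenience. The sketch evaluates a candidate fix expression in the paused frame, which mirrors checking a hypothesis before editing the source.

```python
import sys


def probe(func, args, line_offset, exprs):
    """Run func(*args); when execution reaches `line_offset` lines below the
    `def` line, evaluate each expression in that frame and record the values."""
    hits = []
    target = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not target:
            return None                   # ignore every other frame
        at_breakpoint = frame.f_lineno - target.co_firstlineno == line_offset
        if event == "line" and at_breakpoint:
            hits.append(
                {e: eval(e, frame.f_globals, frame.f_locals) for e in exprs}
            )
        return tracer                     # keep receiving line events

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)            # always restore the previous hook
    return result, hits


def buggy_mean(xs):
    total = 0
    for x in xs[1:]:                      # bug: skips the first element
        total += x
    return total / len(xs)


# Break on the return line (offset 4) and compare the current expression
# with a hypothesized fix that adds the skipped element back.
result, hits = probe(
    buggy_mean,
    ([3, 6, 9],),
    4,
    ["total", "total / len(xs)", "(total + xs[0]) / len(xs)"],
)
```

Here the recorded values show that the accumulated total misses the first element and that the hypothesized expression yields the expected mean, all without modifying `buggy_mean` itself.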
4. Experimental Evaluation and Efficiency
InspectCoder’s efficacy was validated on BigCodeBench-R and LiveCodeBench-R, two demanding self-repair benchmarks characterized by complex logic bugs and challenging test suites. Empirical results indicate:
- Repair Accuracy Gains: InspectCoder achieved relative improvements in bug resolve rate between 5.10% and 60.37% over the strongest static or log-based LLM self-repair baselines.
- Bug-Fix Efficiency: The system demonstrated 1.67× to 2.24× higher bug-fix throughput (measured as fixes per hour), primarily due to reduced patching rounds and efficient debugging action selection.
- Patch Quality: The adoption of runtime root cause analysis led to unique bug fixes unattainable by static approaches, notably on cases involving erroneous control flow, complex data dependencies, and mutually interacting variables.
These results underline the practical benefits of engaging LLMs in interactive, stepwise dynamic analysis instead of static, single-pass repair strategies.
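For readers interpreting these figures, the sketch below shows how a relative improvement and a throughput ratio are conventionally computed. The function names and all input numbers are made up for illustration; they are not results from the evaluation.

```python
def relative_improvement(baseline_rate: float, new_rate: float) -> float:
    """Relative gain of new_rate over baseline_rate, in percent."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (new_rate - baseline_rate) / baseline_rate * 100.0


def throughput_ratio(fixes_a: int, hours_a: float, fixes_b: int, hours_b: float) -> float:
    """Ratio of fixes per hour of system A to fixes per hour of system B."""
    return (fixes_a / hours_a) / (fixes_b / hours_b)


# Hypothetical inputs: a resolve rate rising from 40% to 50% is a gain of
# 10 percentage points but a 25% relative improvement.
gain = relative_improvement(40.0, 50.0)
ratio = throughput_ratio(30, 2.0, 20, 2.0)
```

The distinction matters when reading the accuracy figures: a relative improvement is measured against the baseline's own resolve rate, so the same absolute gain looks larger on a weaker baseline.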
5. InspectWare Middleware: Abstraction and Integration
A core technical enabler is the InspectWare middleware, which serves several functions:
| Role | Description | Benefit |
|---|---|---|
| Session Management | Maintains debugger context and manages transitions across execution modes | Prevents session corruption |
| Enhanced Runtime Modification | Exposes assistant-friendly APIs for breakpoints, execution control, and injection | Enables efficient experimentation |
| Cross-Framework Compatibility | Normalizes debugger interaction across unittest, pytest, and competitive formats | Broad applicability and integration |
The abstraction provided by InspectWare is essential for making the system extensible, reliable, and agnostic to specific underlying testing harnesses or execution environments.
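The normalization idea can be illustrated on a narrower problem than debugger interaction: reducing differently shaped test harnesses to one outcome type. The sketch below is an assumption-laden illustration, not InspectWare code; `Outcome`, `run_unittest_case`, and `run_stdio_case` are hypothetical names, and only the standard library's `unittest` and a stdin/stdout style check are covered.

```python
import io
import unittest
from contextlib import redirect_stdout
from dataclasses import dataclass


@dataclass
class Outcome:
    passed: bool
    detail: str


def run_unittest_case(case_cls) -> Outcome:
    """Run one unittest.TestCase class and reduce it to an Outcome."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case_cls)
    result = unittest.TestResult()
    suite.run(result)
    problems = result.failures + result.errors   # lists of (test, traceback)
    detail = problems[0][1] if problems else ""
    return Outcome(result.wasSuccessful(), detail)


def run_stdio_case(solve, stdin_text: str, expected_stdout: str) -> Outcome:
    """Run a competitive-style solver that prints its answer to stdout."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        solve(stdin_text)
    actual = buffer.getvalue().strip()
    ok = actual == expected_stdout.strip()
    return Outcome(ok, "" if ok else f"expected {expected_stdout!r}, got {actual!r}")


# Toy program under test, exercised through both harness styles.
def double(n: int) -> int:
    return n + n


class DoubleTest(unittest.TestCase):
    def test_double(self):
        self.assertEqual(double(4), 8)


def solve(text: str) -> None:
    print(double(int(text)))


unit_outcome = run_unittest_case(DoubleTest)
stdio_outcome = run_stdio_case(solve, "4", "8\n")
mismatch = run_stdio_case(solve, "4", "9")
```

Once every harness yields the same `Outcome` shape, the agent-facing logic does not need to know which framework produced a failure, which is the kind of decoupling the table attributes to the middleware.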
6. Implications for Automated Debugging and Future Automated Software Engineering
InspectCoder’s integration of LLMs with dynamic debugger control marks a substantive evolution in program repair:
- Systematic Root Cause Diagnosis: Immediate process feedback (“process rewards”) from runtime experiments steers multi-step reasoning, encouraging hypothesis-driven investigation rather than pattern-based or log-matching fixes.
- Bridging Static and Dynamic Analysis: The system blends the strengths of symbolic, semantic code analysis with runtime evidence, promoting both fix accuracy and explainability.
- Foundations for Next-Generation Tools: The demonstrated technical framework offers a model for future LLM-driven IDE assistants, automated code review systems, and continuous integration pipelines that rely on interactive, adaptive root cause diagnosis.
A plausible implication is that future research might focus on LLMs trained or adapted specifically for debugging workflows, and on generalizing InspectCoder’s agentic and dynamic inspection capabilities to non-Python languages and non-deterministic, event-driven applications.
7. Conclusion
InspectCoder introduces an agentic, dynamic analysis–enabled self-repair system that operationalizes interactive root cause diagnosis through active LLM-debugger collaboration. Its dual-agent architecture, capability for strategic breakpoint control and runtime experimentation, and robust InspectWare middleware yield substantial gains in bug resolve rate and repair efficiency when evaluated on rigorous benchmarks. This work signals a transition in automated program repair: from static, pattern-based approaches to feedback-driven, adaptive systems capable of emulating key aspects of expert human debugging (Wang et al., 21 Oct 2025).