Applicability of AgentSZZ to Proprietary or Industrial Codebases

Determine whether AgentSZZ, an agent-based framework that identifies bug-inducing commits using task-specific git tools and a ReAct-style reasoning loop, remains effective and reliable on proprietary or industrial codebases whose development practices differ from those of the open-source projects used in its evaluation.

Background

AgentSZZ is evaluated on three developer-annotated, open-source datasets (Linux, GitHub, and Apache), where it outperforms eleven baselines and shows robustness across C and Java without dataset-specific tuning.

Despite these results, the study scope is limited to open-source repositories. Industrial or proprietary projects may involve different repository structures, workflows, documentation standards, and development practices, which could affect the agent’s investigation strategies and tool effectiveness. The authors therefore explicitly note uncertainty about generalizing the observed gains to such settings.

References

"Finally, our evaluation focuses on open-source projects; the applicability of AgentSZZ to proprietary or industrial codebases with different development practices remains an open question."

AgentSZZ: Teaching the LLM Agent to Play Detective with Bug-Inducing Commits (2604.02665 - Lyu et al., 3 Apr 2026) in Threats to Validity, External Validity (Section 6.2)