Human-In-The-Loop Software Development Agents: Challenges and Future Directions (2506.11009v1)
Published 25 Apr 2025 in cs.SE
Abstract: Multi-agent LLM-driven systems for software development are rapidly gaining traction, offering new opportunities to enhance productivity. At Atlassian, we deployed Human-in-the-Loop Software Development Agents to resolve Jira work items and evaluated the quality of the generated code using functional correctness testing and GPT-based similarity scoring. This paper highlights two major challenges: the high computational cost of unit testing and the variability in LLM-based evaluations. We also propose future research directions to improve evaluation frameworks for Human-in-the-Loop software development tools.