
Facilitating Trustworthy Human-Agent Collaboration in LLM-based Multi-Agent System oriented Software Engineering (2505.04251v1)

Published 7 May 2025 in cs.SE, cs.AI, and cs.MA

Abstract: Multi-agent autonomous systems (MAS) are better at addressing challenges that span multiple domains than singular autonomous agents. This holds true within the field of software engineering (SE) as well. The state-of-the-art research on MAS within SE focuses on integrating LLMs at the core of autonomous agents to create LLM-based multi-agent autonomous (LMA) systems. However, the introduction of LMA systems into SE brings a plethora of challenges. One of the major challenges is the strategic allocation of tasks between humans and the LMA system in a trustworthy manner. To address this challenge, a RACI-based framework is proposed in this work-in-progress article, along with implementation guidelines and an example implementation of the framework. The proposed framework can facilitate efficient collaboration, ensure accountability, and mitigate potential risks associated with LLM-driven automation while aligning with the Trustworthy AI guidelines. The future steps for this work, delineating the planned empirical validation method, are also presented.

Overview of Trustworthy Human-Agent Collaboration in LLM-Based Multi-Agent Systems for Software Engineering

The paper "Facilitating Trustworthy Human-Agent Collaboration in LLM-based Multi-Agent System oriented Software Engineering," presented at the ACM International Conference on the Foundations of Software Engineering, addresses the pivotal challenge of integrating LLM-based Multi-Agent (LMA) systems within software engineering (SE). Multi-Agent Systems (MAS) offer notable advantages in handling complex challenges across various domains, including SE. Specifically, the focus is on LMA systems that utilize LLMs to automate and optimize diverse software development tasks. This paper introduces a framework based on the RACI (Responsible, Accountable, Consulted, Informed) matrix to strategically allocate tasks between humans and LMA systems in a manner that aligns with Trustworthy AI guidelines.

Framework Highlights and Methodology

The authors propose a structured approach to human-agent task allocation, leveraging the RACI matrix to define clear roles and responsibilities within the software development lifecycle (SDLC). This framework provides implementation guidelines to facilitate collaboration, ensure accountability, and mitigate risks inherent in LLM-driven automation.

Key steps include:

  1. Identifying artefact-based tasks suitable for automation.
  2. Listing human actors and LLM-agents involved.
  3. Assessing regulatory constraints to ensure compliance.
  4. Assigning roles (R, A, C, or I) based on task suitability and regulatory considerations.
  5. Designing workflows that incorporate human oversight and validation mechanisms.

The proposed framework emphasizes flexibility, allowing adaptations based on organizational processes and constraints.
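To make steps 3 and 4 concrete, here is a minimal Python sketch of a RACI assignment with two consistency checks. The actor names, the single example task, and the regulatory rule are illustrative assumptions, not details taken from the paper:

```python
# Minimal sketch of RACI-based task allocation (illustrative names and
# rules; not taken from the paper).

RACI = {"R", "A", "C", "I"}  # Responsible, Accountable, Consulted, Informed

# Hypothetical actors: name -> is_human
actors = {"business_analyst": True, "llm_agent_b": False}

# task -> {actor: role}; one illustrative Planning-phase task
matrix = {
    "requirements_elicitation": {"business_analyst": "A", "llm_agent_b": "R"},
}

# Tasks whose regulatory constraints (step 3) demand a human Accountable party
regulated_tasks = {"requirements_elicitation"}

def validate(matrix, actors, regulated_tasks):
    for task, roles in matrix.items():
        assert set(roles.values()) <= RACI, f"{task}: unknown role"
        accountable = [a for a, r in roles.items() if r == "A"]
        # RACI convention: exactly one Accountable party per task
        assert len(accountable) == 1, f"{task}: needs exactly one 'A'"
        if task in regulated_tasks:
            assert actors[accountable[0]], f"{task}: 'A' must be a human"

validate(matrix, actors, regulated_tasks)
print("RACI matrix is consistent")
```

Checks like these could run whenever the matrix is revised, which suits the framework's emphasis on adapting assignments to organizational processes and constraints.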

Example Implementation

An illustrative example considers the Planning phase of the DevOps framework. Assignments are made for tasks such as requirements elicitation, product roadmap creation, feature and user story development, and sprint planning. For instance, LLM-agent B is responsible for generating features and user stories, with a human business analyst accountable for verifying the outputs. This approach optimizes collaboration between humans and autonomous agents while maintaining alignment with the Trustworthy AI guidelines.
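As a sketch of how such an assignment could become a workflow gate (step 5 above), the following Python stub routes LLM-drafted user stories through a human approval step before they are accepted. The function names and the escalation rule are hypothetical, not from the paper:

```python
# Illustrative oversight gate for the Planning-phase example: LLM-agent B
# is Responsible for drafting user stories, and a human business analyst
# is Accountable for verifying them before acceptance.
# All function names below are hypothetical.

def llm_agent_b_draft(feature: str) -> list[str]:
    # Stand-in for an LLM call that drafts user stories for a feature.
    return [f"As a user, I want {feature} so that I can work efficiently."]

def business_analyst_approve(stories: list[str]) -> list[str]:
    # Stand-in for human verification; a real gate would block on review.
    return [s for s in stories if s.strip()]

def plan_feature(feature: str) -> list[str]:
    drafts = llm_agent_b_draft(feature)          # R: LLM-agent B
    approved = business_analyst_approve(drafts)  # A: human analyst
    if not approved:
        raise RuntimeError("No story approved; escalate to human planning")
    return approved

print(plan_feature("single sign-on"))
```

The point of the gate is that automation never bypasses the Accountable human: the LLM's output only enters the backlog once the analyst's review step returns it.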

Implications and Future Directions

This paper contributes significantly to the discourse on trustworthy AI in software engineering by offering a practical solution for role delineation in LMA systems. The framework enhances efficiency by leveraging specialized LLM-agents for automating complex processes while ensuring human oversight and accountability. Practical implications include improved task execution, enhanced collaboration, and adherence to ethical AI standards.

The authors outline plans for future empirical validation through a groupware walkthrough-based multi-case study, aiming to refine the framework based on expert feedback and real-world applications. As AI models continue to evolve, future research may explore extensions of this framework to accommodate new technologies and methodologies, further bridging the gap between human expertise and autonomous capabilities in software engineering.

Authors (1)
  1. Krishna Ronanki (7 papers)